I have some sewing patterns that I would like to share (and hopefully swap) but all of the PDFs have a

“This was purchased by John Doe [email protected] #ordernumber - if you are not John Doe, please dob in the person you got this from to [email protected] so we can sic our lawyers on them”

sorta footer on every single page.

Obviously, for privacy reasons (and because I don’t actually want lawyers sicced on me), I need to remove this footer.

These are often complex PDFs with more than a hundred pages and multiple layers.

I managed to remove the editing password (not a user/viewing password; the files just couldn’t be edited without it) with qpdf --decrypt. But removing the footer has left me at a dead end. I even tried manually removing every single instance of the footer using Master PDF Editor, but saving the file flattened it, so you can no longer show/hide layers, which is essential for correct printing. (Please don’t ask me how many different PDF editors I have tried; it has been so, so, SO many that I have lost count.)

Not that I really want to manually edit this out of what could amount to over a thousand pages, but searching for a command to remove a certain phrase has come up empty. Even Master PDF Editor doesn’t seem to have a bulk remove or search-and-replace function (just search).
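
One avenue that might be worth a try, as a rough sketch and not a known fix: qpdf’s QDF mode (`qpdf --qdf --object-streams=disable in.pdf in.qdf.pdf`) writes the page content uncompressed, so the footer can sometimes be found as plain bytes. The Python below overwrites a phrase with the same number of spaces, so no byte offsets or stream lengths change and nothing else in the file (layers included) is touched. It assumes the footer is stored as one unbroken literal string in a simple encoding, which many PDFs do not do (subset fonts, kerned `TJ` arrays, hex strings). `blank_phrase` and the file names are made up for illustration.

```python
from __future__ import annotations

from pathlib import Path


def blank_phrase(data: bytes, phrase: bytes) -> tuple[bytes, int]:
    """Overwrite every occurrence of `phrase` with spaces of equal length.

    Keeping the length identical means the xref offsets and stream /Length
    values of an uncompressed (QDF) file stay valid.
    Returns the new bytes and the number of occurrences that were blanked.
    """
    if not phrase:
        raise ValueError("phrase must not be empty")
    count = data.count(phrase)
    return data.replace(phrase, b" " * len(phrase)), count


if __name__ == "__main__":
    # Hypothetical file names; in.qdf.pdf is assumed to come from:
    #   qpdf --qdf --object-streams=disable in.pdf in.qdf.pdf
    src = Path("in.qdf.pdf")
    if src.exists():
        cleaned, hits = blank_phrase(
            src.read_bytes(), b"This was purchased by John Doe"
        )
        print(f"blanked {hits} occurrence(s)")
        if hits:
            Path("out.qdf.pdf").write_bytes(cleaned)
```

If it reports zero occurrences, the text is probably encoded some other way and this approach will not help as written.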

I use Linux btw.

  • Pup Biru@aussie.zone
    12 hours ago

    to really hammer home this “many ways to hide”: the PDF is kinda just like a container… it contains other things like images (the patterns for example)… these patterns are probably vector graphics (made up of lines rather than pixels)… this means you can magnify them basically infinitely… and they can contain transparent lines and all sorts of things. they could easily embed that same text in the SVG image, at tiny scale (less than a pixel at 100% scale), and make it transparent… no PDF editor is going to touch the image data: it simply doesn’t really understand it to that degree - it’s an image; not a PDF after all… so that information will remain even after you’ve removed all visible/reasonable marks

    this is just 1 example of practically infinite places it could be - and remember, this text is just lines in an image! it’s not like you can ctrl+f for the text necessarily… you’d have to go through every image manually and inspect every single line, and even then there are no guarantees (perhaps they encoded that information like morse code in bumps in some lines that are only barely visible at 1000% magnification)
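
A rough way to check for the easy cases only, sketched in Python under the assumption that the file has first been uncompressed (for example with qpdf’s QDF mode): count how often the buyer’s name, email or order number shows up as a literal string, a hex string, or UTF-16BE text. `find_traces` is a made-up helper, and a zero count proves nothing about identifiers hidden in line geometry or image data, which is exactly the problem described above.

```python
from __future__ import annotations


def find_traces(data: bytes, needle: str) -> dict[str, int]:
    """Count occurrences of `needle` in a few common PDF string encodings."""
    raw = needle.encode("latin-1")
    variants = {
        "literal": raw,                                  # (John Doe)
        "hex_upper": raw.hex().upper().encode("ascii"),  # <4A6F686E...>
        "hex_lower": raw.hex().encode("ascii"),          # <4a6f686e...>
        "utf16be": needle.encode("utf-16-be"),           # metadata-style text
    }
    return {name: data.count(pattern) for name, pattern in variants.items()}
```

Anything this finds is worth a closer look; anything it misses may still be there in a form a byte search cannot see.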