
Never trust a lawyer with a redact tool any more complicated than a marker.

I've seen lawyers at major, high-priced law firms make this same mistake. Once it was a huge list of individuals' names and bank account balances. Fortunately I was able to intervene just before the uploaded documents were made public.

Folks around here blame incompetence, but I say the frequency of this kind of cock-up is crystal clear telemetry telling you the software tools suck.

If the software is going to leverage the familiarity of using a blackout marker to give you a simple mechanism to redact text, it should honour that analogy and work the way any regular user would expect, by killing off the underlying text you're obscuring, and any other corresponding, hidden bits. Or it should surface those hidden bits so you can see what could come back to bite you later. E.g., it wouldn't be hard to make the redact tool simultaneously act as a highlighter that temporarily turns proximate text in the OCR layer a vibrant yellow as you use it.



It often comes down to not using the right software and training issues. They have to use Acrobat, which has a redaction tool. This is expensive so some places cheap out on other tools that don’t have a real redaction feature. They highlight with black and think it does the same thing whereas the redaction tool completely removes the content and any associated metadata from the document.

This was basically the only reason we were willing to cough up like $400 for each Acrobat license for a few hundred people. One redaction fuckup could cost you whatever you saved by buying something else.

I would like to believe that the DOJ lacking the proper software might have something to do with DOGE. That would be sweet irony.


If my law firm can't afford the $20/month for a copy of Acrobat Pro, I'd be very concerned what else they are cutting corners on.


Law firms are notoriously behind in tech. I’ve seen some shit. A small firm running on the owner’s personal Dropbox account with client matter files stored alongside his porn collection, ancient, unsupported software, unpatched systems, basically zero information security, servers in a bathroom and network switches in a shower, a literal hoarder with garbage and shit in the office, etc. The Dropbox guy was basically a giant in his practice area. Very successful. You have no idea how bad things are behind the scenes.

I think it's usually a bit more complicated, i.e. the people who were expected to do processes don't and someone else shows the people asking for access that there's a faster, cheaper, cooler tool.

This is to be expected from an effort like DOGE simply because the E is for Efficiency. That is, how well a system is performing: the ratio of useful output to energy input.

Unfortunately the E in DOGE should have been for Effectiveness. That is, is the system shooting at the right target, and how close is it to hitting that target.

You can be very efficient but if you’re doing the wrong thing(s) you’re ultimately wasting resources.

The irony is, DOGE got the E wrong. It’s efficient but not effective.


Or it's a scam run by someone who wants to get access to the Social Security info on Americans. We are in trouble if you think the acronym is the biggest issue.

I was speaking to the difference between efficiency and effectiveness. DOGE is simply the current best example.

Putting the obvious aside, sure, it’s Trump’s fault the system was so mismanaged that he’s been able to get elected. Twice. You’d think that after the first term the system would have gotten the message. It did not.

My recommendation to you is ask: How did we get here? And who is accountable for this?

There’s a very good chance those giving you your current narrative marching orders are on that list. Funny, right? Why own their failure when they can convince fools to blame a symptom?


Not even. Anyone still left at DOJ working to protect the president is immensely corrupt, and this is just the careless stupidity that typically goes along with deeply corrupt people.


I feel like the number of incidents related to "fully public S3 buckets" has gone down after AWS made it nearly impossible to miss the notice.

I think someone just got free marketing materials to promote the redaction tools.

Now many more people will be aware of the issue.


Are you saying that only Adobe PDF has proper redaction tools? I did a quick search and found several open source PDF tools claiming to do redaction; are they all faulty? I would honestly be surprised if there aren't any free tools that do it right.


No that's not what GP is saying. GP is saying that there is software that does not have a redaction feature (perhaps because the developer didn't implement it), but users of the software worked around it by adding a black rectangle to the PDF in such software, falsely believing it to be equivalent to redaction.

Properly implementing redaction is a complicated task. The redaction can be applied to text, so the software needs to find out which text is covered by the rectangle and remove it. The redaction can be applied to images, so the software needs to edit a dizzying array of image formats supported by PDF (including some formats frequently used by PDFs but used basically nowhere else, like JBIG2). The redaction can be applied to invisible text (such as OCR text of a scanned document). The redaction can be applied to vector shapes, so some moderately complicated geometry calculations are needed to break the vector shapes and partially delete them.
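
To make the text case concrete, here is a minimal sketch in Python. It assumes a toy page model in which every text span carries a bounding box; `Rect`, `TextSpan`, and `redact` are hypothetical names for illustration, not any PDF library's API. Real PDF content streams are far messier, and images and vector shapes are not handled here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    def intersects(self, other: "Rect") -> bool:
        # Two axis-aligned rectangles overlap unless one lies entirely
        # to the side of, above, or below the other.
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class TextSpan:
    text: str
    bbox: Rect
    visible: bool = True  # False models an invisible OCR text layer


def redact(spans, area):
    """Return only the spans that do not touch the redaction area."""
    return [s for s in spans if not s.bbox.intersects(area)]


# Hypothetical page content: a label, a name, and an invisible OCR copy of the name.
spans = [
    TextSpan("Account holder:", Rect(10, 10, 110, 22)),
    TextSpan("Jane Doe", Rect(115, 10, 170, 22)),
    TextSpan("Jane Doe", Rect(115, 10, 170, 22), visible=False),
]
area = Rect(112, 8, 175, 24)  # the rectangle the user drew over the name

remaining = redact(spans, area)
print([s.text for s in remaining])
```

Note that the invisible OCR copy is dropped too, because the check looks at geometry rather than visibility.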

It's very easy to imagine having a basic PDF editor that does not have a redaction feature because implementing the feature is hard.

For the same reason, a basic PDF editor does not have a real crop feature. Such an editor adds a cropbox and keeps all the content outside the cropbox.


> Folks around here blame incompetence, but I say the frequency of this kind of cock-up is crystal clear telemetry telling you the software tools suck.

Absolutely. They know this is confusing, and they're bound and determined not to fix it. At the least, they need a pop-up to let you know that it's not doing what you might think it's doing.


Apple’s Preview app does exactly that. I discovered this while trying to make a blanked copy of kid #2’s homework worksheet for kid #1 who left his at school after kid #2 already wrote on her copy.


I’m optimistic that because LLMs have brought down the cost of the mere act of typing out code, we will see a shift in focus toward certification and verification. Preferably with some legal protection for customers, which is sorely lacking today.


Apple’s Preview app (which has a very thorough PDF markup tool) does this right: it has an explicit “redact” tool which deletes the content it’s used on.


Always worth remembering that PDFs are basically a graphic design format/editor from the 90s. It was never intended for securely redacting documents and while it can be done, that’s not the default behaviour.

No surprise non-experts muck it up and I don’t see that changing until they move to special-purpose tools.


Of course we can blame incompetence. It's incompetent not to realise your own incompetencies, also known as overconfidence.

Any lawyer should be like "I don't know what I'm doing here I'll get an expert to help" just like as a software developer I'd ask a lawyer for their help with law stuff...because IANAL uwu


I think it's part laziness here.

Placing a black rectangle on a PDF is easier than modifying an image or removing text from that same PDF.


The tool in Acrobat is exactly placing black rectangles on stuff. There's a second step you are supposed to do when you are finishing marking the redactions that edits out the content underneath them, and offers to sanitize other hidden data:

https://www.adobe.com/acrobat/resources/how-to-redact-a-pdf....

That failed redactions happen over and over and over is kind of amazing.


I hope you're not blaming the users. It's understandable they would be confused. The software needs to clarify it for the user. Perhaps, when you try to save it, it should warn you that it looks like you tried to redact text, and that text is still embedded in the document and could be extracted. And then direct you to more information on how to complete the redaction.


We have 30 years direct evidence that the users would ignore that warning, complain about the computer warning them too much, insist that the warning is entirely unnecessary, and then release a document with important information unredacted.

The problem is that the user generally doesn't have a functioning mental model of what's actually going on. They don't think of a PDF as a set of rendering instructions that can overlap. They think it's paper. Because that's what it pretends to be.

The best fix for this in almost any organization is the one that untrained humans will understand: after you redact, you print out and scan back in. You have a policy that redacted documents must be scanned in from physical paper.
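
The "rendering instructions that can overlap" model can be sketched in a few lines of Python. This is a toy display list, not real PDF syntax; `cover_up`, `extract_text`, and `true_redact` are made-up names for illustration.

```python
# Toy "page": an ordered list of drawing operations, painted first to last.
page = [
    ("text", "Balance: $1,234,567", (50, 700)),
]


def cover_up(ops, rect):
    # What a fake redaction does: paint a black box on top of everything else.
    return ops + [("fill_rect", "black", rect)]


def extract_text(ops):
    # Copy/paste and text extraction read the text operations directly;
    # whatever was painted over them later makes no difference.
    return [payload for kind, payload, _ in ops if kind == "text"]


def true_redact(ops, index, rect):
    # A real redaction deletes the text operation itself, then draws the box.
    kept = [op for i, op in enumerate(ops) if i != index]
    return cover_up(kept, rect)


box = (45, 695, 260, 715)
covered = cover_up(page, box)
redacted = true_redact(page, 0, box)

print(extract_text(covered))   # the "hidden" text is still in the file
print(extract_text(redacted))  # nothing left to recover
```

Both versions look identical on screen, which is why the paper mental model fails. Printing and rescanning works because it throws away every operation and keeps only the painted pixels.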


> The problem is that the user generally doesn't have a functioning mental model of what's actually going on

Sorry, but a professional user not having an operational understanding of the tools they're working with is called culpable negligence in any other profession. A home user not knowing how MS Word works is fine, but we're talking desk clerks whose primary task is document management, and lawyers who were explicitly tasked with data redaction for digital publication. I don't think we should excuse or normalize this level of incompetence.


I don't expect radiologists to have a good understanding of the software involved in the control loops for the equipment they operate. Why should a lawyer have to have a mental model or even understand how the pdf rendering engine works?

Have you ever had to actually redact a document in Acrobat Pro? It's way more fiddly and easy to screw up than one would expect. I'm not saying professionals shouldn't learn how to use their tools, but the UI in Acrobat is so incredibly poor that I completely understand when redaction gets screwed up. Upthread there's an incomplete but very extensive list of this exact thing happening over and over. Clearly there's a tools problem here. Actual life-critical systems aren't developed this way; if a plane keeps crashing due to the same failure we don't blame the pilot. Boeing tried to do that with the MAX, but they weren't able to successfully convince the industry that that was OK.


> if a plane keeps crashing due to the same failure we don't blame the pilot

That's true, we blame the manufacturer and demand that they fix their product under threat of withdrawing the airworthiness certification. So where's the demand for Adobe to fix its software, under pain of losing their cash cow?

Yet, people here are arguing that it is perfectly OK that professionals keep working with tools that are apparently widely known to be inappropriate for their task. Why should we not blame the lawyers that authorized the use of inappropriate tooling for such a sensitive task as legal redaction of documents?


The link in the comment you are replying to has a screenshot of exactly this. It’s a prompt with a checkbox asking you to delete the metadata and hidden info involved with the redaction. You’d have to blaze past that and not read it to make this mistake. It is user error.

I guess if you really want to defend users here you could say people are so desensitized by popup spam that a popup prompt is just gonna be clicked through so fast the user probably barely recognizes it, but that’s not the software’s problem. For whatever reason some users would prefer to just put black boxes over obfuscated text, so here we are.


Professional users doing more than 1 document? Yes, I'm absolutely blaming them.

I agree that affordances are good, but tools are tools, they can have rough edges, it's okay that it occasionally takes more than zero knowledge and attention to use them.


> I hope you're not blaming the users.

If software developers designed hammers, you'd have to twist the handle before each swing to switch from tack to nail mode. And the two heads would be indistinguishable from each other.

If business MBAs designed them, you'd wind up with the SaaSy Claw 9000, free for the first month then $9.95 in recurring subscription fees, and compatible only with on-brand nails that each have a different little ad imprinted on the head.

But it doesn't matter, because by the end of the year all construction will be vibe-built from a single prompt to Clawde.ai, which will pound non-stop, burning through $1T of investor funds, and confidently hallucinate 70% of the nails until the roof collapses on the datacenter destroying the machine and civilization along with it, and a post-singularity survivor picks up a rock and looks calculatingly at a pointy shard of metal...


JIT dual hardware and software design and manifestation

The software could do better, sure, but in this case the accountability clearly falls on the lawyers. It's their job - and it's a job that can profoundly impact people's lives, so they need to take it seriously - to redact information properly.


Adobe's contempt for users strikes again.


The consequences of fucking it up are low, too.

If they get caught, they just take the document down and deny it ever got posted. Claim whatever people can show is a fake.

Since they control the levers of government, there are few with the resources and appetite for holding them accountable. So far, we haven't un-redacted anything too damning, so push hasn't come to shove yet.

That might only change if there's a "blue wave" in the midterms, but even then I wouldn't count on it.


I’ve not looked too deeply, but based on other discussion, I wonder if this was malicious noncompliance meant to reveal what the higher-ups were ordering hidden. If victims’ names are properly redacted that would be strong evidence.


It is more likely they have no conceptual understanding that the PDF is a file format. They likely assume that whatever is shown in the interface is what is exported.

> Never trust a lawyer with a redact tool any more complicated than a marker.

there's white-out on my monitor.

> ...frequency of this kind of ...

sometimes I wonder if it is plausible deniability. Like people don't WANT to cover this up and do it in a certain way.



