Semantic ablation: Why AI writing is boring and dangerous

Powderhorn@beehaw.org · 2 months ago

Lvxferre [he/him]@mander.xyz · edit-2 2 months ago

I believe that good communication has four attributes.

1. It’s approachable: it demands from the reader (or hearer, or viewer) the least amount of reasoning and previous knowledge, in order to receive the message.
2. It’s succinct: it demands from the reader the least amount of time.
3. It’s accurate: it neither states nor implies (for a reasonable = non-assumptive receiver) anything false.
4. It’s complete: it provides all relevant information concerning what’s being communicated.

However, no communication is perfect, and those four attributes are at odds with each other: if you try to optimise your message for one or more of them, the others are bound to suffer.

Why this matters here: it shows the problem of ablation is unsolvable. Even if generative models were perfectly competent at rephrasing text (they aren’t), simply by asking them to make the text more approachable, you’re bound to lose info or accuracy. Especially on the current internet, where you’ve got a bunch of skibidi readers who’ll screech “WAAAAH!!! TL;DR!!!” at anything with more than two sentences.

I’d also argue “semantic ablation” is actually way, way better as a concept than “hallucination”. The latter is not quite “additive error”; it’s a misleading metaphor for output that is generated by the model the same way as the rest, but that happens to be incorrect when interpreted by human beings.

Powderhorn@beehaw.org · 2 months ago

As a former linguistics major, I find this to be horseshit.

Really, we optimize for the least possible amount of communication necessary. With a spouse, you don’t ask full questions. Early on, you might have to shoot a look, but later on? This is now ingrained. They’re offering the solution before you express the problem.

Lvxferre [he/him]@mander.xyz · 2 months ago

To be clear, by “communication” I’m talking about the information conveyed by a certain utterance, while you’re likely referring to the utterance itself.

Once you take that into account, your example is optimising for #2 at the expense of #1: yes, you can get away with conveying info in more succinct ways, but at the expense of requiring a shared context, and that shared context is also info the receiver has to know beforehand. It works fine in this case because spouses accumulate that shared context across the years (so it’s a good trade-off), but if you replace the spouse with some random person it becomes a “how the fuck am I supposed to know what you mean?” matter.

Powderhorn@beehaw.org · 2 months ago

Sure. That’s a specific use case and not likely a useful one.

When we start getting into utterances, though, we’re firmly in linguistics. Unless you’ve been passing bad checks.

Lvxferre [he/him]@mander.xyz · 2 months ago

Yeah, got to borrow some word from discourse analysis :-P

It fits well what I wanted to say, and it makes the comment itself another example of the phenomenon: that usage of “utterance” as jargon makes the text shorter and more precise but makes it harder to approach = optimises for #2 and #3 at the expense of #1. (I had room to do it in this case because you mentioned your Linguistics major.)

Although the word is from DA, I believe this to be related to Pragmatics; my four points are basically a different “mapping” of the Gricean maxims (#1 falls into the maxim of manner, #2 of manner and relation, #3 of quality, #4 of quantity) to highlight trade-offs.

Powderhorn@beehaw.org · 2 months ago

I never got a degree! I got roped into the college paper, and from there, well, I didn’t really care about my studies. Why worry about semantics and semiotics when you can tell 18,000 people what to think?

(yeah, I meandered into news after cutting my teeth in opinion)