Just as the community adopted the term “hallucination” to describe additive errors, we must now codify its far more insidious counterpart: semantic ablation.
Semantic ablation is the algorithmic erosion of high-entropy information. Technically, it is not a “bug” but a structural byproduct of greedy decoding and RLHF (reinforcement learning from human feedback).
During “refinement,” the model gravitates toward the center of the Gaussian distribution, discarding “tail” data – the rare, precise, and complex tokens – to maximize statistical probability. Developers have exacerbated this through aggressive “safety” and “helpfulness” tuning, which deliberately penalizes unconventional linguistic friction. It is a silent, unauthorized amputation of intent, where the pursuit of low-perplexity output results in the total destruction of unique signal.
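As a toy illustration of that “tail” claim (my own sketch with made-up tokens and probabilities, not anything taken from the article or from a real model): greedy decoding always emits the modal token, so the rare word can never surface, while sampling at least gives it a chance.

```python
import numpy as np

# Hypothetical next-token distribution; token names and probabilities are invented.
tokens = ["said", "noted", "remarked", "opined", "ejaculated"]
probs = np.array([0.55, 0.20, 0.15, 0.08, 0.02])

# Greedy decoding: always pick the most probable token, so the tail never appears.
greedy = tokens[int(np.argmax(probs))]

# Sampling: low-probability "tail" tokens still occasionally survive.
rng = np.random.default_rng(42)
sampled = tokens[rng.choice(len(tokens), p=probs)]

print("greedy:", greedy, "| sampled:", sampled)
```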
When an author uses AI for “polishing” a draft, they are not seeing improvement; they are witnessing semantic ablation. The AI identifies high-entropy clusters – the precise points where unique insights and “blood” reside – and systematically replaces them with the most probable, generic token sequences. What began as a jagged, precise Romanesque structure of stone is eroded into a polished, Baroque plastic shell: it looks “clean” to the casual eye, but its structural integrity – its “ciccia” – has been ablated to favor a hollow, frictionless aesthetic.
Over and over it’s the same story. AI makes everything dumber.
This is a good name for one of the main reasons I’ve never really felt a desire to have an LLM rephrase/correct/review something I’ve already written. It’s the reason I’ve never used Grammarly, and turned off those infuriating “phrasing” suggestions in Microsoft Word that serve only to turn a perfectly legible sentence into the verbal equivalent of Corporate Memphis.
I’m not a writer, but lately I often deliberately edit myself less than usual, to stay as far as possible from the semantic “valley floor” along which LLM text tends to flow. It probably makes me sound a bit unhinged at times, but hey, at least it’s slightly interesting to read.
I do wish the article made it clear if this is an existing term (or even phenomenon) among academics, something the author is coining as of this article, or somewhere in between.
GPT-4o mini, “Rephrase the below text in a neutral tone”:
This name is appropriate for one key reason: I have not felt the need to use an LLM for rephrasing, correcting, or reviewing my writing. This is also why I have not utilized Grammarly and have disabled the “phrasing” suggestions in Microsoft Word, which often transform a clear sentence into something overly corporate or generic.
Although I wouldn’t categorize myself as a writer, I have been intentionally editing myself less than usual lately to avoid the typical style associated with LLM-generated text. This approach might come across as unconventional at times, but it can also make for more engaging reading.
I also wish the article clarified whether this term is already established in academic circles, if the author is introducing it for the first time, or if it falls somewhere in between.
“avoid the typical style associated with LLM-generated text” – slop!
Wow, that GPT rewrite is awful. Not just bland as hell, but it also changed the meaning. The first sentence is very different.
That’s a fine illustration of the problem, whatever it’s properly called.
Having paused to search the web, I find that “ablation”, according to Wikipedia, is a term that has been used in AI since 1974. Arxiv.org has a recent paper that uses the phrase “semantic ablation” to describe an operation that deliberately removes semantic information from an LLM’s representation of a sentence, in an attempt to see what purely syntactical information is left over afterwards, or something like that.
Interesting, thanks for doing the research!
As an extreme non-expert, I would say “deliberate removal of a part of a model in order to study the structure of that model” is a somewhat different concept to “intrinsic and inexorable averaging of language by LLM tools as they currently exist”, but they may well involve similar mechanisms, and that may be what the OP is referencing, I don’t know enough of the technical side to say.
That paper looks pretty interesting in itself; other issues aside, LLMs are really fascinating in the way they build (statistical) representations of language.
I’m not sure if that writer gets all the details right when it comes to how it works, but I do like “semantic ablation.” It’s good to finally have a name for that after we’ve already seen so much of it.
It’s statistical blandness writ large.
The stack of single-sentence paragraphs after the introduction paragraph trying so hard to have an impact.
The tendency to put “not X, not Y, just Z” everywhere.
The perfect conclusion written at the end of each piece, summarising three bland paragraphs with yet another bland paragraph.
Statistically regurgitated bullshit, all of it
A stack of single-sentence paragraphs, you say?
With a perfect conclusion written at the end you say?
Methinks I’ve seen this before somewhere, I say.
Dare to be different, I say.
Ha, if you’re alluding to my post being similar to generated output, you obviously haven’t experienced the pure blandness of LLMs trying to write engaging content.
I wondered if what I said would come across as criticism - even though I took care to avoid alluding to your comment NOT being statistically bland (which ironically, due to your third point, would have begun to imply that it WAS, despite my saying explicitly the opposite).
So we are proving in real time why LLMs go to such lengths to be bland - their goal of ~~not offending anyone~~ making their shareholders more money does not allow them to take those kinds of risks, as I just did above. All the more so with their child-like yet incurious audience noping out at the first hint of difficulty ~~understanding~~ producing dopamine upon reading anything at all - not attempting clarification or expounding additional details as you just did.

So kudos, I suppose we just proved our humanity? Now to do that 10k times a day for the rest of our natural lives…
Could have just said “popularity breeds mediocrity”, and it works on that level, but I appreciate this term too.
I believe that good communication has four attributes.
- It’s approachable: it demands from the reader (or hearer, or viewer) the least amount of reasoning and previous knowledge, in order to receive the message.
- It’s succinct: it demands from the reader the least amount of time.
- It’s accurate: it neither states nor implies (for a reasonable = non-assumptive receiver) anything false.
- It’s complete: it provides all relevant information concerning what’s being communicated.
However, no communication is perfect, and those four attributes are at odds with each other: if you try to optimise your message for one or more of them, the others are bound to suffer.
Why this matters here: it shows the problem of ablation is unsolvable. Even if generative models were perfectly competent at rephrasing text (they aren’t), simply by asking them to make the text more approachable, you’re bound to lose info or accuracy. Especially on the current internet, where you’ve got a bunch of skibidi readers who’ll screech “WAAAAH!!! TL;DR!!!” at anything with more than two sentences.
I’d also argue “semantic ablation” is actually way, way better as a concept than “hallucination”. The latter is not quite “additive error”; it’s a misleading metaphor for output that the model generates the same way as everything else, but that happens to be incorrect when interpreted by human beings.
As a former linguistics major, I find this to be horseshit.
Really, we optimize for the least possible amount of communication necessary. With a spouse, you don’t ask full questions. Early on, you might have to shoot a look, but later on? This is now ingrained. They’re offering the solution before you express the problem.
To be clear, by “communication” I’m talking about the information conveyed by a certain utterance, while you’re likely referring to the utterance itself.
Once you take that into account, your example is optimising for #2 at the expense of #1 — yes, you can get away with conveying info in more succinct ways, but at the expense of requiring a shared context; that shared context is also info the receiver knows beforehand. It works fine in this case because spouses accumulate that shared context across the years (so it’s a good trade-off), but if you replace the spouse with some random person it becomes a “how the fuck am I supposed to know what you mean?” matter.
Sure. That’s a specific use case and not likely a useful one.
When we start getting into utterances, though, we’re firmly in linguistics. Unless you’ve been passing bad checks.
Yeah, I got to borrow a word from discourse analysis :-P
It fits what I wanted to say well, and it makes the comment itself another example of the phenomenon: using “utterance” as jargon makes the text shorter and more precise but harder to approach = optimises for #2 and #3 at the expense of #1. (I had room to do it in this case because you mentioned your linguistics major.)
Although the word is from DA, I believe this to be related to Pragmatics; my four points are basically a different “mapping” of the Gricean maxims (#1 falls into the maxim of manner, #2 of manner and relation, #3 of quality, #4 of quantity) to highlight trade-offs.
I never got a degree! I got roped into the college paper, and from there, well, I didn’t really care about my studies. Why worry about semantics and semiotics when you can tell 18,000 people what to think?
(yeah, I meandered into news after cutting my teeth in opinion)
I recently read a lovely short story about this: https://sightlessscribbles.com/the-colonization-of-confidence/
Great.
asking that machine to improve writing is like asking a blender to improve a salad
A counterweight grows as people learn to reappreciate the acerbity of art and words. The ring is poisonous and must be destroyed, but it can unite.
That was awesome







