Archived link: https://archive.ph/Vjl1M
Here’s a nice little distraction from your workday: Head to Google, type in any made-up phrase, add the word “meaning,” and search. Behold! Google’s AI Overviews will not only confirm that your gibberish is a real saying, it will also tell you what it means and how it was derived.
This is genuinely fun, and you can find lots of examples on social media. In the world of AI Overviews, “a loose dog won’t surf” is “a playful way of saying that something is not likely to happen or that something is not going to work out.” The invented phrase “wired is as wired does” is an idiom that means “someone’s behavior or characteristics are a direct result of their inherent nature or ‘wiring,’ much like a computer’s function is determined by its physical connections.”
It all sounds perfectly plausible, delivered with unwavering confidence. Google even provides reference links in some cases, giving the response an added sheen of authority. It’s also wrong, at least in the sense that the overview creates the impression that these are common phrases and not a bunch of random words thrown together. And while it’s silly that AI Overviews thinks “never throw a poodle at a pig” is a proverb with a biblical derivation, it’s also a tidy encapsulation of where generative AI still falls short.


Even if LLMs were trained only on facts and, say, excluded Shakespeare, first, I don’t think they would function at all, because they would be missing far too much of our mental space, and second, they would still hallucinate, because their core function is generating data out of the latent space. They find the meaning relationships that exist between words; without “non-facts” they would have a sparser understanding of everything, and they would probably tend to bullshit even more. They have no concept of how certain they are of what they output, only an ability to map onto the training data and fill in the gaps in between. We do the same thing when operating at the edge of knowledge, and we discover many “true after the fact” things this way.
I think what they’re going to do is have a special fact-based submodel: extract the factual claims from the output, actually search databases of information to confirm or deny each claim, then reprompt the model to issue new output. Rinse and repeat until the fact-check submodel no longer has objections.
It’s probably going to suck at everything else and still get things wrong sometimes for any question that isn’t really strongly settled.
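
To make the fact-check loop I'm guessing at a bit more concrete, here is a rough Python sketch. Everything in it is made up for illustration: `FACTS` stands in for a real database, `extract_claims` just splits on periods, and `toy_model` is a fake generator, not any real LLM API. It only shows the generate, check, reprompt cycle, not how anyone actually builds this.

```python
from typing import Callable

# Hypothetical stand-in for a database of settled facts (assumption).
FACTS = {
    "water boils at 100 c at sea level": True,
    "the moon is made of cheese": False,
}

def extract_claims(text: str) -> list[str]:
    """Toy claim extractor: treat each sentence as one factual claim."""
    return [s.strip().lower() for s in text.split(".") if s.strip()]

def check_claims(claims: list[str]) -> list[str]:
    """Return only the claims the fact store explicitly marks as false.

    Claims missing from the store pass through unchecked.
    """
    return [c for c in claims if FACTS.get(c) is False]

def fact_checked_generate(
    prompt: str, model: Callable[[str], str], max_rounds: int = 3
) -> str:
    """Generate, fact-check, and reprompt until there are no objections."""
    output = model(prompt)
    for _ in range(max_rounds):
        objections = check_claims(extract_claims(output))
        if not objections:
            return output
        # Feed the objections back to the model and try again.
        prompt = f"{prompt}\nAvoid these false claims: {'; '.join(objections)}"
        output = model(prompt)
    return output  # rounds exhausted; output may still be wrong

def toy_model(prompt: str) -> str:
    """Fake generator (assumption): drops the false claim once told to."""
    if "Avoid these false claims" in prompt:
        return "Water boils at 100 C at sea level."
    return "Water boils at 100 C at sea level. The moon is made of cheese."

if __name__ == "__main__":
    print(fact_checked_generate("Tell me some facts", toy_model))
```

Note that anything not in the fact store sails straight through the checker, which is the same weakness I mean with questions that aren't strongly settled.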