Honestly, I don’t get how AI isn’t rolling backwards already. Image sites are buried in AI slop. Social media is buried in AI slop, and now so are e-mails, which were probably written by AIs too. How is AI even remotely improving right now, when obviously 90% of the new training data it’s getting was generated by the last generation of AI?
From what I’ve been hearing, AI has indeed been getting worse, not better. I think I read this in relation to ChatGPT 5 compared to previous models.
Companies that build LLMs have already said that this is becoming a problem. They’re running out of high-quality human-written content to train their models on.
Google paid Reddit for access to its data to train Google’s models, which is probably why Google’s AI can be a bit dumb at times (and of course, the users who actually contributed the content don’t get any of that money).
https://en.wikipedia.org/wiki/Model_collapse
That’s true, but I think it’s in the phrasing: they describe it as a shortage of human-made content. The bigger issue is that nobody can reliably identify human-made content. I.e., you give it Reddit and our e-mails, and there’s plenty of human-made content on there… but nobody knows what percentage of it is actually bots or AIs.
AI is inbred.