OpenAI confirms that AI writing detectors don’t work

fne8w2ah@lemmy.world · 3 年前

OpenAI confirms that AI writing detectors don’t work

Leate_Wonceslace@lemmy.dbzer0.com · 3 年前

I feel like this must stem from a misunderstanding of what 26% accuracy means, but for the life of me, I can’t figure out what it would be.

dartos@reddthat.com · edit-2 3 年前

Looks like they got that number from this quote from another arstechnica article ”…OpenAI admitted that its AI Classifier was not “fully reliable,” correctly identifying only 26 percent of AI-written text as “likely AI-written” and incorrectly labeling human-written works 9 percent of the time”

Seems like it mostly wasn’t confident enough to make a judgement, but 26% it correctly detected ai text and 9% incorrectly identified human text as ai text. It doesn’t tell us how often it labeled AI text as human text or how often it was just unsure.

EDIT: this article https://arstechnica.com/information-technology/2023/07/openai-discontinues-its-ai-writing-detector-due-to-low-rate-of-accuracy/

schzztl@lemmy.nz · 3 年前

Specificity vs sensitivity, no?

cmfhsu@lemmy.world · edit-2 3 年前

In statistics, everything is based off probability / likelihood - even binary yes or no decisions. For example, you might say “this predictive algorithm must be at least 95% statistically confident of an answer, else you default to unknown or another safe answer”.

What this likely means is only 26% of the answers were confident enough to say “yes” (because falsely accusing somebody of cheating is much worse than giving the benefit of the doubt) and were correct.

There is likely a large portion of answers which could have been predicted correctly if the company was willing to chance more false positives (potentially getting studings mistakenly expelled).