The results, especially the high numbers stated in the news article (68% recall, 90% accuracy), are overestimates, because their verification method (i.e., checking whether the LLM really identified the right account) came from matching verified accounts against a test set of anonymous accounts whose real names they already knew. They knew the real names because those people had a public link to their LinkedIn in their “anonymous” profile (the link was removed for the sake of testing whether the LLM could match the two accounts). That being said: a user who posts under a pseudonym but publicly links the account to, say, a LinkedIn profile doesn’t really care about anonymity, and may hand out many more ‘breadcrumbs’ to follow than a truly anonymous account would.
But I still think that even a fully anonymous account can be fingerprinted and matched to a non-anonymous identity by an LLM, based on language, style, etc.
Reminds me of an AI tool that could identify the authorship of articles with surprisingly high accuracy. Then they peeked under the hood and realized it was just looking for the byline at the top of the article that says “By John Doe” — it completely failed if the article didn’t explicitly say who the author was.
I can’t believe this product, modeled after humans, would lie and cheat like humans