I recently discovered that some popular federated instances have been using LLM-assisted moderation tooling that evaluates whether someone has said something bannable. They do this by running a script/app that sends the user’s comment history to OpenAI with the question “analyze this content for evidence of *specific political ideology* sentiment. Also identify any related *political ideology* tropes”. (The italic bits are where I’ve redacted the ideology they’re seeking).
OpenAI’s LLM (they’re using GPT-5.3-mini) then responds with something like:
and so on, hundreds of comments.
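To make the mechanics concrete, here is roughly what such a script could look like. This is a hypothetical sketch reconstructed from the description above, not the actual tool: the function name, the model string and the prompt placeholders are all my own assumptions.

```python
# Hypothetical sketch of the kind of script described above; names, model and
# prompt wording are illustrative assumptions, not the actual tool.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

PROMPT = (
    "Analyze this content for evidence of <redacted ideology> sentiment. "
    "Also identify any related <redacted ideology> tropes."
)

def assess_user(comments: list[str]) -> str:
    """Send a user's comment history to OpenAI and return the model's assessment."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; substitute whichever "mini" model the instance actually uses
        messages=[
            {"role": "system", "content": PROMPT},
            {"role": "user", "content": "\n\n".join(comments)},
        ],
    )
    return response.choices[0].message.content
```

Part of what worries me is how little code this takes: a few lines are enough to glue a user’s entire posting history to a third-party API.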
I have not named the instances or people involved, to give them time to consider the results of this discussion, make any corrective changes they want, and disclose their practices at their own pace and in their own way. I have also redacted the evidence to avoid personal attacks and dogpiling. Let’s focus on the system, not the individuals involved. Today these instances and people are using it, and maybe we’re OK with that because it’s being used by groups we agree with. But what if people we strongly disagree with used it on their instances tomorrow?
The use and existence of this tooling raises a lot of other questions too.
What are the risks? Fedi moderators are often unsupervised, untrained volunteers and these are powerful tools.
What safeguards do we need?
Would asking an LLM “please evaluate this person’s political opinions” give different results than “find evidence we can use to ban them” (as used in the cases I’ve seen)?
What are our transparency expectations?
Is this acceptable and normal?
Should this tooling be disclosed? (it was not – should it have been?)
If you were given a choice, would you have opted out of it?
Can we opt out?
Are there GDPR implications? Privacy implications? Should these tools be described in a privacy policy?
Are private messages being scanned and sent to OpenAI?
How long should these assessments be retained, and can we request to see them or ask for them to be deleted?
Once the user’s comments are sent to OpenAI, are they used to train its models?
What will the effect be on our discourse and culture if people know they are being politically profiled?
Where are the lines between normal moderation assistance tools, political profiling and opaque 3rd-party data processing?
I hope that by chewing over these questions we can begin to establish some norms and expectations around this technology. The fediverse doesn’t have any centralized enforcement so we need discussions like this to develop an awareness of what people want in terms of disclosure, privacy, consent and acceptable use. Then people can make choices about which instances they join and which ones they interact with remotely.
And of course there are the other issues with LLMs relating to environmental sustainability, erosion of workers’ rights, increasing the cost of living and on and on. I can’t see PieFed adding any functionality like this anytime soon. But it’s happening out there anyway, so now we need to talk about it.
What do you make of this?



I understand that some form of automation is necessary - we saw large instances closing because they couldn’t find mods. My main objection in this scenario would be that I didn’t consent to my comments being used to train OpenAI’s models. I think the users should know if their instance uses external services like that.
I also suspect that there might be cheaper and more ethical solutions. Although it’s hard to talk about this without seeing the actual results.
That was not why lemm.ee died. That one was a lack of admins, presumably due to their choice to become a “free speech haven” where all voices were allowed to be heard. But having given in that far to incel culture, they nonetheless kept trying to pull back and stop short of becoming another 4chan. The burden on those admins was INTENSE, and they simply gave up.
Btw, PieFed already offers a number of tools that help reduce the burden on human mods and make decisions easier.
Some examples:
One example is an indicator that reports the likelihood that a user’s content is generated by AI.
Many other tools even reduce the need for moderation in the first place. E.g., iirc, if someone receives 10x more downvotes than upvotes, a visual icon is added next to their username to indicate that they are known as a contentious user. Note this is not a filter, just a label. As an end-user reads through comments, upon seeing it they are forewarned that a deeper discussion with such a person is unlikely to be pleasant, in the judgment of the majority of others who have interacted with them in the past.
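Just to illustrate how simple that kind of heuristic is, here is a minimal sketch assuming a plain 10:1 downvote-to-upvote rule; the names and the exact rule are my guesses, not PieFed’s actual code:

```python
# Sketch of a "contentious user" indicator based on a simple 10:1
# downvote-to-upvote ratio; PieFed's real heuristic and names may differ.
def is_contentious(upvotes: int, downvotes: int, ratio: int = 10) -> bool:
    """True if received downvotes outnumber upvotes by `ratio` to one or more."""
    return downvotes >= ratio * max(upvotes, 1)

print(is_contentious(upvotes=3, downvotes=40))   # True  -> show the warning icon
print(is_contentious(upvotes=50, downvotes=12))  # False -> no icon
```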
Another feature example is keyword filtering - if someone wants to remove all content containing the words “Trump” or “Musk”, then rather than downvoting it or reporting it to mods for removal, they can have it hidden just for themselves, thereby substantially reducing the burden on mods for things like keeping politics out of unrelated (comics?) communities.
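A minimal sketch of that sort of per-user keyword filter, with illustrative names rather than PieFed’s real schema:

```python
def visible_to(blocked_keywords: set[str], title: str, body: str) -> bool:
    """Hide a post for this user if it mentions any keyword on their personal list."""
    text = f"{title} {body}".lower()
    return not any(keyword.lower() in text for keyword in blocked_keywords)

# e.g. a user filtering "Trump" and "Musk" in a comics community:
print(visible_to({"Trump", "Musk"}, "Musk announces X", "..."))      # False (hidden for them)
print(visible_to({"Trump", "Musk"}, "New webcomic drop", "enjoy!"))  # True  (shown)
```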
Still another pair of examples is the ability to automatically collapse or even hide comments that fall below a given vote threshold - personally I have these options off, but if someone were to want that for themselves, then the tool is available to them, again independently of any need for moderator intervention. And the user likewise controls what that threshold is, rather than the mod having to make a single call for everyone in the entire community.
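And a similar sketch of per-user collapse/hide thresholds; the default values here are made up for illustration:

```python
def display_mode(score: int, collapse_below: int = -5, hide_below: int = -20) -> str:
    """Decide, per user, whether a comment is shown, collapsed, or hidden."""
    if score <= hide_below:
        return "hidden"
    if score <= collapse_below:
        return "collapsed"
    return "shown"

print(display_mode(-7))                   # collapsed
print(display_mode(-25))                  # hidden
print(display_mode(-25, hide_below=-50))  # collapsed (a user with a laxer threshold)
```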
PieFed’s democratization of moderation features is breathtakingly awesome to behold - Reddit and even Lemmy (+ Mbin, NodeBB, etc.) have nothing that even comes close! And note that article was even written two years ago (but I will stop myself short of making my comment a billet-doux on PieFed, instead just saying that it has long had this automation already).