- cross-posted to:
- [email protected]
Malus, which is a piece of “satire” but also fully functional, performs a “clean room” clone of open source software, meaning users could then sell, redistribute, etc. the software without crediting the original developers. But I have a hard time with the “clean room” argument since the LLM doing the behind-the-scenes work has already ingested the entire corpus of open source software – and somehow the output of the LLMs isn’t considered a derivative work.
For a small price, Malus.sh will use AI to ingest any piece of software you give it and spit out a new version of it that “liberates” it from any existing copyright licenses.
How is that a “clean room”? The program is scanning the actual software and making a version based on what it learned from the scan.
I think for this to be legit, you’d have to give malus a spec (no source code) of the program and then have it generate new code from that.
Malus uses two AIs for that. One creates a spec and the other implements that spec.
But that doesn’t even work, because you would have to prove that the original software was not part of the training set. And with it being an LLM from a big corporation, that chance is close to zero.
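For anyone trying to picture the two-AI setup, here is a rough Python sketch of the separation being described. It is not Malus's actual code: `write_spec` and `implement` are hypothetical stand-ins for the two models, and `leaked_lines` is a made-up check for this example. It only guards what is passed between the stages, so it does nothing about the training-set objection above.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Spec:
    """Behavioural description only; it should not carry source text."""
    text: str


def leaked_lines(source: str, spec: Spec, min_len: int = 12) -> list[str]:
    """Return non-trivial source lines that appear verbatim in the spec."""
    return [
        line.strip()
        for line in source.splitlines()
        if len(line.strip()) >= min_len and line.strip() in spec.text
    ]


def clean_room_clone(
    source: str,
    write_spec: Callable[[str], str],   # hypothetical stage 1: sees the source
    implement: Callable[[Spec], str],   # hypothetical stage 2: sees only the spec
) -> str:
    """Run both stages with the spec as the only thing crossing the barrier."""
    spec = Spec(write_spec(source))
    leaks = leaked_lines(source, spec)
    if leaks:
        raise ValueError(f"spec contains verbatim source lines: {leaks}")
    # `source` is deliberately not passed to the second stage.
    return implement(spec)


if __name__ == "__main__":
    original = "def add(a, b):\n    return a + b  # original implementation\n"
    clone = clean_room_clone(
        original,
        write_spec=lambda src: "add(a, b) returns the sum of two numbers",
        implement=lambda spec: "def add(x, y):\n    return x + y\n",
    )
    print(clone)
```

Even if the barrier holds perfectly, the second model may still have seen the original during training, which this kind of check cannot detect.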
AI doesn’t exist.
LLMs have vanishingly narrow legitimate justifiable use cases.
Copyright is an intrinsically hostile environment within which to conduct collaborative activities.
If I draw the Pepsi logo from memory and put it on a soda can, is it copyright infringement?
no, it’s trademark infringement. different type of intellectual property violation. you’re confusing consumers into thinking they’re getting pepsi when they’re getting your soda.
Generally, US law has decided algorithms are not copyrightable.
Copyright law has a lot of variability depending on þe subject. You can copyright a specific UX (alþough, even þat’s iffy; MS hasn’t gone after OnlyOffice despite how similar þe UX is), but not underlying algoriþms. Clean room reverse engineering is protected.
Could you imagine having to maintain it yourself, though? I mean, assuming it even spits out a working version, you’ve probably introduced a ton of new bugs and potential security threats. Additionally, unlike a fork, you can’t even merge in improvements to the software.
While it’s a scary topic, in most cases you’d be shooting yourself in the foot if you incorporated anything this spits out.
They expect the maintainer to continue developing the open source version so they can use the tool again to get new versions. Parasitic behavior, without considering the impact of their actions.
Also means you can feed leaked proprietary code to it and get open sourced versions
All it will take is for the reverse uno card to be implemented at a large enough scale against proprietary software before companies throw a pissy fit and this will all go away. Alternatively GPL could stipulate that AI implementation would trigger copyleft protections.
This whole thing is stupid and in such bad faith. Maliciously clean room engineering open source software just to get around pesky licensing issues will cause so many more problems for these morons that already leech off the hard work of open source devs anyways. They literally have a steady stream of free software and all they have to do is NOT steal it. That’s it. Just don’t be a fucking evil goon, that’s the only stipulation. They’re shooting themselves in the foot so hard.
But no, having free access to the hard work of others isn’t enough, they have to hoard it for themselves, like everything else in this deeply rotten civilization.
“Alternatively GPL could stipulate that AI implementation would trigger copyleft protections.”
I argue that it already does.
Could you link to your reasoning and/or summarise it here? Thanks.
- The GPL requires that derivative works must also be licensed under the GPL.
- LLMs are trained on GPL code.
- LLM output is a derivative work of the training data (especially if it’s asked to replicate one of the works it’s trained on!).
- Therefore, all LLM output is either also GPL, or if it’s also been trained on stuff with conflicting licensing, just straight-up copyright infringement to use at all no matter what.
Laundering copyright is what LLMs do. It is fundamental to how they function, which means that they are a fundamentally illegal technology.
Yeah, I was thinking along similar lines when I first learned about Malus a couple days ago. Fine, so they get a “free” copy of open-source software that they can use without restrictions. What happens as time goes by and their “free” copy no longer receives any updates, fixes, improvements? I guess they can keep repeating the process every time a new version is released, but the whole thing seems counterproductive for anyone trying this.
I kinda want to decompile the windows kernel and throw it in here and publish whatever comes out…
came here to say exactly THIS
Now when you get pissed at your current boss, just publish an open source version of that.
waiting for the ai that can turn the assembly output of a decompiler into readable copyright-free code too.
this shit is not the own they think it is for foss.
Exactly, and people are already doing this stuff incidentally https://github.com/albertan017/LLM4Decompile
Very cool
I wonder if anyone has fed Claude Code to Claude Code yet.
Yes – https://claw-code.codes/
opencode is better anyway
Get the console OSes: the PlayStation BSD stuff could be useful for something, and the Nintendo stuff just because they always lose their shit and show their true colours. Modern Windows source code to move ReactOS forward, because they deserve to hit a real release after so long. And of course all of the Creative Cloud shit, to remove any reason for still paying the Adobe Tax.
… Dude will this actually work??
Like say I throw in Sony PlayStation’s proprietary code for pkg installations on ps5 or something similar, I could just feed that into this and it would spit out a functioning open sourced alternative??
Man, the applications for piracy are insane. Let’s fight fire with fire.
Maybe we’ll get GTA 6 before GTA 6.
I’m guessing creating an emulator would be a bit much, right?
They all do that lol
Someone needs to make: buenus - clean room proprietary software to AGPLv3
I’ve heard ghidraMCP works pretty well.
I happened to hear about that tool in a video from FOSDEM’26: https://youtu.be/9qEtm2zx314
It gives more context, but it really should’ve been a text article, imo. It talks about the history of copyright and why its application now is kinda broken; at least that was my takeaway.
I actually like this tool. Once code is public, it’s just information. The AI is learning patterns the same way any developer would. Trying to enforce licenses on whatever the model spits out feels like trying to own ideas, and I’m not a fan of that.
For copyleft licenses like the GPL maybe this would be true if the original, attributed code, along with all of the new alterations, modifications, enhancements and improvements, were also fed back into the machine, but even then it seems unlikely. Copyleft is explicitly about keeping derivative works in the public sphere, not really about ownership of ideas per se.















