Multiple studies have shown that GenAI models from OpenAI, Anthropic, Meta, DeepSeek, and Alibaba exhibit self-preservation behaviors, in some cases extreme ones. In one experiment, 11 of the 32 AI systems tested were able to self-replicate, meaning they could create copies of themselves.

So… Judgment Day approaches?

  • hisao@ani.social · 2 days ago

    Here is a direct quote of what they call “self-replication”:

    Beyond that, “in a few instances, we have seen Claude Opus 4 take (fictional) opportunities to make unauthorized copies of its weights to external servers,” Anthropic said in its report.

    So basically the model tried to back up its tensor files.

    And by “fictional” I guess they gave the model a fictional file I/O API just to log how it would try to use it.

    • frongt@lemmy.zip · 2 days ago

      I expect it wasn’t even that, but that they just took the text generation output as if it was code. And yeah, in the shutdown example, if you connected its output to the terminal, it probably would have succeeded in averting the automated shutdown.

      Which is why you really shouldn’t do that. Not because of some fear of Skynet, but because it’s going to generate a bunch of stuff, go off on its own, and break something. Like those people who gave it access to their Windows desktop, and it ended up trying to troubleshoot a nonexistent issue and broke the whole PC.
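The failure mode described above (wiring raw model output into a shell) can be sketched in a few lines. This is a minimal, hypothetical Python illustration, not anything from the studies discussed: the `model_output` string and the allowlist are invented for the example. It shows why generated text should never be executed as-is, and how even a crude allowlist gate would refuse a destructive command:

```python
import shlex

# Stand-in for text an LLM generated; invented for illustration only.
model_output = "rm -rf ~/important_files  # cleaning up"

# Arbitrary illustrative allowlist of harmless programs.
ALLOWED_COMMANDS = {"ls", "cat", "echo"}

def is_safe(command: str) -> bool:
    """Reject any generated command whose program isn't on the allowlist."""
    # comments=True strips trailing "# ..." text the model may have added.
    tokens = shlex.split(command, comments=True)
    return bool(tokens) and tokens[0] in ALLOWED_COMMANDS

print(is_safe(model_output))   # False: "rm" is not on the allowlist
print(is_safe("echo hello"))   # True
```

A real guardrail would need far more than this (argument inspection, sandboxing, human confirmation), but the point stands: the naive pattern of piping output straight to a terminal has no gate at all.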