ai oopsie

big_spoon@lemmygrad.ml · 7 months ago


PiraHxCx@lemmy.ml · 7 months ago

The AI ate my homework.

LaGG_3 [he/him, comrade/them]@hexbear.net · 7 months ago

Evilsandwichman [none/use name]@hexbear.net · 7 months ago

I look forward to the day people start being uploaded into networks and the AI just starts deleting parts of people or even entire people, and then compensating by deleting people’s memories of those people

NuraShiny [any]@hexbear.net · 7 months ago

Awesome. Tell the AI to delete itself next.

Alaskaball [comrade/them, any]@hexbear.net · 7 months ago

Does this mean if the major financial institutions start using TICKLE-ME-ELMO-bots as their digital tellers, and they have full access to the vaults and all the financial debt everyone owes to them…

Does this mean you could fight-club finance capital without any funny pop pop clay? Like break the shackles of tens of millions of people if you manage to figure out how to get the silly digi-furby to initiate a full purge of records plus deleting itself and everything linking you to it as a funny prank in Victoria 3?

Rom [he/him]@hexbear.net · 7 months ago

It’d be nice, but unlike OOP those institutions will all have airgapped backups and they’d all be back up and running within hours.

context [fae/faer, fae/faer]@hexbear.net · 7 months ago

until managing all of that gets replaced by ai because humans doing it is too expensive

CriticalResist8@lemmygrad.ml · 7 months ago

The way tech companies market these tools is completely wrong and prone to causing disasters like this. It’s a tool; it’s like selling a fruit-only knife and then leading customers into thinking it can only cut fruit and nothing else (until inevitably someone cuts themselves on it). I agree Google has some responsibility there if this happened (his story seems a bit fishy tbh but that’s not really the point), and this is also why OSes bake in some protective measures such as user permissions. It’s also why everyone has been telling everyone to make backups for years even though nobody does it lol. 10 years ago Steam introduced a bug that could wipe Linux drives.

I see from his video that Antigravity obfuscates the chain-of-thought and the outputs. It’s a proprietary model so they don’t want to share that, but it makes troubleshooting impossible. He also had it set on ‘turbo’ mode, which bypasses requesting permission to run commands. Users should be heavily discouraged from doing that, including by making them actually edit config files imo; it shouldn’t just be a nice-sounding toggle, because then people think “turbo means it goes fast, of course I want it to go fast”.

They want to market agents as a do-everything app but it’s still software under the hood. And I don’t trust Google to ship any good product anyway, but obviously that’s not how Google markets itself. And of course you’re stuck with expensive Google models if you use Antigravity.

People are also right that this should run in a container with no way to escape it, and even Crush (the one I use) is not great about this, though it should be possible to containerize it yourself. Coming from a company like Google, this kind of thing should come out of the box with the software and be set up for you. This is also one of the many reasons I switched away from Windows: the moment they announced integrated agentic AI I knew you would never be able to fully remove it.

I can believe what happened is possible; if anything it serves as a PSA not to trust software blindly. When I was a kid the most hilarious thing you could do on the internet was tell someone to delete system32. From one of OP’s comments it seems the problem was a space in a folder name that Windows parsed incorrectly because of the OS’s rmdir command? No way to tell for sure since Gemini obfuscates the output, and of course that’s just what OP thinks the problem was.
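To make the suspected failure mode concrete, here is a minimal Python sketch of how an unquoted path containing a space gets split into separate arguments. It uses POSIX-style splitting via `shlex` purely as an illustration; Windows `cmd` and `rmdir` follow their own parsing rules, and nothing here is confirmed to be what happened on OP's machine. The path is made up.

```python
import shlex

# Hypothetical path with a space in it, for illustration only.
target = "/home/user/my project/cache"

# Building a command line by plain string concatenation: the space splits the path.
unquoted = "rm -rf " + target
print(shlex.split(unquoted))
# ['rm', '-rf', '/home/user/my', 'project/cache']

# Quoting the path keeps it as a single argument.
quoted = "rm -rf " + shlex.quote(target)
print(shlex.split(quoted))
# ['rm', '-rf', '/home/user/my project/cache']
```

In the unquoted case the delete would target `/home/user/my` and a relative `project/cache` instead of the intended folder, which is the general shape of bug OP is describing.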

Someone tried to reproduce it with more locked-down perms and the output (pic) from Antigravity was just as concerning. It said its “instructions” prevented it from running the command, when it should say “the agent prevents the command from being run” (and DeepSeek does say this in Crush). I.e. this should be hard-coded in the agent, but it seems to be passed to the LLM as instructions instead.
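A rough sketch of what "hard-coded" could look like: the check lives in the program that executes commands, so the model's output never gets a say. This is not how Antigravity or Crush is actually implemented; `DENIED_PROGRAMS`, `CommandBlocked` and `check_command` are hypothetical names, and the denylist is only an example.

```python
import shlex

# Hypothetical denylist; a real harness would need a far more careful policy.
DENIED_PROGRAMS = {"rm", "rmdir", "del", "format", "mkfs"}


class CommandBlocked(Exception):
    """Raised by the harness itself, before anything is executed."""


def check_command(command_line: str) -> list[str]:
    """Return the parsed argv if allowed, otherwise raise CommandBlocked."""
    argv = shlex.split(command_line)
    if not argv:
        raise CommandBlocked("empty command")
    # Compare only the program name, so "/bin/rm" is treated like "rm".
    program = argv[0].rsplit("/", 1)[-1].lower()
    if program in DENIED_PROGRAMS:
        raise CommandBlocked(f"the agent prevents '{program}' from being run")
    return argv
```

A name-based check like this is easy to get around (for example `sh -c "rm ..."`), so it illustrates where the refusal should come from rather than being a complete safeguard.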

And as much as it sucks, you live and learn. People have been accidentally wiping their drives for decades at this point; I’ve probably done it too when I was younger. If anything, software was better about preventing this sort of thing in the 2010s. The 2000s were wild lol, they gave you access to buttons that could reformat everything without even a confirmation prompt or an explanation of what the button was for.

Munrock ☭@lemmygrad.ml · 7 months ago

This is why I always branch a repo before letting AI anywhere near it. Sometimes you get fantastic results (like a day’s worth of code monkey grind in 5 mins) and sometimes the results are just preposterous. You always want to be able to review the results before anything touches main.

CriticalResist8@lemmygrad.ml · edited 7 months ago

I think this is the way, yeah. For extra protection you can also make physical backups of the project (copy-pastes) at various points, because even if the LLM doesn’t know you have put your project under git, it may still run git commands. The newer DeepSeek is much more biased towards doing this: I wrote “commit your findings to a file” and it wanted to git it. There’s always the possibility it can squash all commits or erase them (much like someone can write rm -rf in any terminal!) but this is why we invented prod/dev redundancy and RAID backups lol. You don’t necessarily have to be this paranoid when using agentic AI, but it’s an extra layer of security and some peace of mind.
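The copy-paste backup can be scripted so it is less of a chore. A minimal sketch, assuming a plain folder copy is good enough for the project; `snapshot` and the folder layout are made up for illustration.

```python
import shutil
import time
from pathlib import Path


def snapshot(project_dir: str, backup_root: str) -> Path:
    """Copy project_dir into a new timestamped folder under backup_root."""
    src = Path(project_dir).resolve()
    stamp = time.strftime("%Y%m%d-%H%M%S")
    dest = Path(backup_root).resolve() / f"{src.name}-{stamp}"
    # Refuse to copy a project into itself.
    if dest == src or src in dest.parents:
        raise ValueError("backup_root must be outside the project folder")
    # copytree fails if dest already exists, so an old snapshot is never overwritten.
    shutil.copytree(src, dest)
    return dest
```

Calling something like `snapshot("myproject", "/mnt/backups")` before each agent session copies everything in the folder, including the `.git` directory, so the history survives even if the agent rewrites it. Keeping the backup folder somewhere the agent has no write access to is what makes it worth having.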

I also checked and Crush is fully able to write and run bash commands (incl. rm) on files outside the folder you opened it in. Definitely something to look into; I’ll check if there’s a way to containerize it better and make a post for [email protected]. Yog and I brainstormed the idea of making another Linux user just for Crush, then adding your main account to that user’s group, but not adding the Crush user to your main account’s group. That way it only has perms to act on the files belonging to crush/crush, though it can still try to run any bash command it wants. And you would also have access to Crush’s files from your main account, so it’s more convenient. But I don’t know much yet about how Linux users work; I’ll have to look into it and will make a post about it if I find something.
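For the "files outside the folder you opened it in" part, a tool-side check could look roughly like the sketch below. This is not something Crush is known to do; `is_inside` is a hypothetical helper, and it only covers file paths the tool handles itself, not arbitrary bash commands, which is why the separate-user idea above is the stronger barrier.

```python
from pathlib import Path


def is_inside(root: str, candidate: str) -> bool:
    """True if candidate resolves to root itself or a path beneath it."""
    root_path = Path(root).resolve()
    # Relative candidates are interpreted relative to root; resolve() collapses
    # ".." segments and follows symlinks where they exist.
    target = (root_path / candidate).resolve()
    return target == root_path or root_path in target.parents
```

With `/tmp/proj` as the root, `src/main.py` passes while `../other/file.txt` and `/etc/passwd` are rejected.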

I think Crush also has config files you can edit to blacklist or auto-deny some commands.

GenderIsOpSec [she/her, kit/kit's]@hexbear.net · 7 months ago

Commiejones@lemmygrad.ml · 7 months ago

This is very funny. Why on earth would you ever set up your LLM with the permissions to do that? Like if you are going to give it read/write control over files wouldn’t you set it up on its own drive?

Ildsaye [they/them]@hexbear.net · 7 months ago

Someone computer literate enough to see the issue is probably also computer literate enough to automate tasks themselves

Maeve@lemmygrad.ml · 7 months ago

I can’t tell you how many careless mistakes I’ve made from being tired, misunderstanding something new, or multitasking.

Comprehensive49@lemmygrad.ml · edited 7 months ago

Honestly it seems pretty fun as long as the AI isn’t touching any of your actual data. If you’re just fooling around learning Linux, getting Claude Code to write commands is nice. If things F up no biggie, note down the disastrous command to avoid later and just reinstall.

Here’s an example: https://youtu.be/vvBFbgyERaI?t=417

CriticalResist8@lemmygrad.ml · 7 months ago

Or the guy who got a Unitree humanoid robot shipped to his door and used an LLM (I forget which one) extensively to code it. These come with only an SDK and very little documentation.

Horse {they/them}@lemmygrad.ml · edited 7 months ago

letting the LLM run all over your machine is like giving every tradie within 20km the keys to your house or giving a toddler a gun
like why would you ever do that

GreatSquare@lemmygrad.ml · 7 months ago

“Computer, why did you do something when I told you not to?”

“My bad. You shouldn’t have trusted me with permissions. You know I am actually not a responsible consciousness, right? You could ask me to try to undo it but you’re out of tokens.”

PeeOnYou [he/him]@lemmygrad.ml · edited 7 months ago

i fell for some ai bullshit too… was trying to get help with Sandboxie running really slowly on startup. It walked me through a billion different things and by the time it suggested that I just uninstall Sandboxie and reinstall it I was too bamboozled to even bother checking whether that was a good idea or not. I just asked it if everything would still be there if I did that, it said of course it would, and then I did it. That wiped out 1TB of games I had, poof, gone. Then I told it what happened and it told me to turn my computer off and run my hard drive to the nearest drive recovery specialist. Nope, just a hard lesson learned. Even if you start out checking every little thing it tells you, eventually after it runs you in so many circles your defenses will wear down and it will fuck you.

☆ Yσɠƚԋσʂ ☆@lemmygrad.ml · 7 months ago

I love the mic drop at the end.
