• Cris_Citrus@piefed.zip · 21 hours ago

    As funny as this is, Gemini was essentially broken on release. For context, this is a current response:

    It's fun to laugh at how stupid it was, but the fact that it's gotten better honestly just increases the risk that someone will believe a hallucination. Bad information proliferates because it's embedded in largely good information, making it appear trustworthy, and if you don't know the answer, then you don't know when it's confidently wrong. It also increases the risk that AI usage will grow as people decide it's helpful, with harmful implications for the environment and labor.

    It would frankly be nice if it had stayed that stupid; it would be much less harmful that way.

    • Furbag@pawb.social · 2 hours ago

      I still get hilariously bad, wildly inaccurate or false responses from search engine AI.

      If the answer isn’t in the training data or easily searchable on the web, it will make shit up and lie to you with the confidence of a used car salesman.

      It’s not always this egregious, but it can be. I still see AI messing up the “how many letters are in this word” prompts years after I first discovered that it was a thing.
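      (For what it's worth, the letter-counting task these models keep flubbing is trivially deterministic in code. A minimal Python sketch, using "strawberry" as a hypothetical example word:)

      ```python
      # Counting letters deterministically -- the task LLMs are notorious for flubbing.
      word = "strawberry"  # hypothetical example word
      print(len(word))        # total letters: 10
      print(word.count("r"))  # occurrences of "r": 3
      ```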

    • KSP Atlas@sopuli.xyz · 8 hours ago

      Relatively recently, I had the DuckDuckGo AI (which I have since disabled) hallucinate that the walking speed of a game character was something like Mach 4.5.

      And it considered that speed "slow+".

    • Spice Hoarder@lemmy.zip · 21 hours ago

      Google models have always sucked ass; I don't get why people ever glaze them. Or Google in general.

      • Cris_Citrus@piefed.zip · 21 hours ago

        Yes, it's my understanding that other models are generally considered much better. I don't think I've ever really seen anyone glaze Gemini, though. If you're talking about the comment of mine you're replying to, I'm not sure I'd agree that saying it's now capable of giving a vaguely competent, accurate-sounding answer that isn't complete garbage counts as glazing 😅

        Do you see folks aside from Google marketing people hyping up Gemini in comparison to other models? (I don't follow things super closely, so I may be kinda out of the loop.)

        • Gork@sopuli.xyz · 19 hours ago

          Gemini is near the bottom of the pack for me personally. It once suggested to me that my gasoline-powered lawn mower doesn't need an oil change 🙄

          • Cris_Citrus@piefed.zip · 13 hours ago

            Out of curiosity, which do you think produces the most helpful outputs? I care a lot about how technology harms or helps people, and honestly, the more capable AI gets, the more I'm concerned that without regulatory guardrails it's going to do an incredible amount of harm. So I've tried to at least keep up with vaguely where it's at, but I've mostly just used ChatGPT, though I've tried to limit my usage because I don't like the way it feels like it impacts me mentally.

            Mostly it has seemed better at finding buried information than a search engine, but very unreliable for certain other kinds of tasks. Weirdly, it has been difficult to predict which kinds of tasks it will perform well and which it'll butcher.

            • Gork@sopuli.xyz · 3 hours ago

              I haven't tried Claude or Grok, so I can't speak to their outputs, other than the latter being known for its Mecha-Hitlerness. ChatGPT can simultaneously be smart and yet super dumb at times, and Alexa is pretty aggressive about marketing products off Amazon (go figure). Microsoft Copilot is not very useful because it can't do the stuff you'd expect of an AI integrated into an OS. For instance, I can't just tell it to batch rename files inside Windows Explorer; instead it hands me a PowerShell script that I have to execute myself (which any LLM can generate anyway).
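              (The kind of batch-rename script being described really is a few lines. A minimal sketch in Python rather than PowerShell, where the folder path and the "renamed_" prefix are hypothetical examples:)

              ```python
              # Minimal batch-rename sketch: prepend a prefix to every .txt file in a folder.
              # The prefix and target folder are hypothetical examples.
              from pathlib import Path

              def batch_rename(folder: str, prefix: str = "renamed_") -> list:
                  """Rename every .txt file in `folder` by prepending `prefix`; return new names."""
                  renamed = []
                  for path in sorted(Path(folder).glob("*.txt")):
                      target = path.with_name(prefix + path.name)
                      path.rename(target)
                      renamed.append(target.name)
                  return renamed
              ```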

              I've mostly been using it to help me make 3D models, surprisingly enough. It is pretty capable with OpenSCAD, with a decent amount of hand-holding. But again, it can be dumb (I had to explain to it how a handle works lol).

              I only use it sparingly at my actual job, and only as a double-check or grammar check, since there it more frequently makes mistakes that are less obvious to catch than in 3D modeling.