• dejected_warp_core@lemmy.world · 2 days ago

    When writing code, I don’t let AI do the heavy lifting. Instead, I use it to push back the fog of war on tech I’m trying to master. At the same time, I keep the dialogue in a space where I can verify what it’s giving me.

    1. Never ask leading questions. Every token you add to the conversation matters, so phrase your query in a way that forces the AI to connect the dots for you.
    2. Don’t ask for deep reasoning and inference. It’s not built for that, and it will bullshit/hallucinate if you push it to do so.
    3. Ask for live hyperlinks so it’s easier to fact-check.
    4. Ask for code samples, algorithms, or snippets that do discrete tasks you can easily follow.
    5. Ask for A/B comparisons between a stack you know by heart and the one you’re exploring (see the sketch after this list).
    6. It will screw this up eventually. Report hallucinations back to the conversation.
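
    For #5, here’s the kind of comparison I mean: a minimal sketch assuming Node’s built-in http module is the stack you know and Express is the one you’re exploring (the /health endpoint is made up for illustration).

    ```typescript
    // Same toy task ("GET /health returns JSON") in both stacks, so the
    // one you know by heart validates the one you're learning.
    import { createServer } from "node:http";
    import express from "express";

    // A: Node's built-in http module (the stack you know)
    createServer((req, res) => {
      if (req.url === "/health") {
        res.writeHead(200, { "Content-Type": "application/json" });
        res.end(JSON.stringify({ ok: true }));
      } else {
        res.writeHead(404);
        res.end();
      }
    }).listen(3000);

    // B: Express (the stack you're exploring)
    const app = express();
    app.get("/health", (_req, res) => res.json({ ok: true }));
    app.listen(3001);
    ```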

    About 20% of the time, it’ll suggest things that are entirely plausible, probably should exist, but don’t. Some platforms and APIs really do have barn-door-sized holes in them, and it’s staggering how readily AI reports false positives in those spaces. It’s almost as if the whole ML training strategy assumes a kind of uniformity across the training set, on all axes, that leads to this flavor of hallucination. In any event, it’s been helpful to know this is where it’s most likely to trip up.

    Edit: an example of one such API hole came up when I asked ChatGPT about doing specific things in Datastar. This is kind of a curveball, since there’s not a huge amount online about it. It first hallucinated an attribute namespace prefix of data-star-, which is incorrect (it uses data- instead). It also dreamed up a JavaScript-callable API parked on a non-existent Datastar object. Both of those concepts conform strongly to the broader world of browser-extending APIs, would be incredibly useful, and are things you might expect to be there in the first place.
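
    As a concrete sketch of that miss (data-on-click is my example of a real Datastar attribute, pulled from its docs, so verify there; the Datastar.* call below is the hallucination, not a real API):

    ```html
    <!-- What ChatGPT invented: a data-star- prefix and a global object.
         Neither exists; the method name here is illustrative. -->
    <button data-star-on-click="increment()">+1</button>
    <script>Datastar.increment("count")</script>

    <!-- What Datastar actually uses: plain data-* attributes. -->
    <button data-on-click="$count++">+1</button>
    ```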

    • clif@lemmy.world · 2 days ago

      My problem with this, if I understand correctly, is that I can usually do all of this faster without having to lead an LLM around by the nose and coerce it into being helpful.

      That said, search engines do suck ass these days (thanks, LLMs)

      • dejected_warp_core@lemmy.world · 2 days ago

        That’s been my biggest problem with the current state of affairs. It’s now easier to research newer tech through an LLM than it is to play search-result whack-a-mole, on the off chance that what you need is on a forum that isn’t Discord. At least an AI can mostly make sense of vendor docs and extrapolate a bit from there. That said, I don’t like it.

          • xthexder@l.sw0.com · 2 days ago

            It’s a struggle even finding the manual these days if you don’t already know where it is or what it’s called. I was searching for info about an issue with my car recently, and like 90% of the results were generic AI-generated “How to fix ______” pages with no actual information specific to the car I was searching for.

            • boonhet@sopuli.xyz · 1 day ago

              I searched up a video on replacing a part on my car. I did find it, but I also found 15 videos that were AI-generated product reviews of the part.

              I definitely also want my car parts to be “sleek and stylish” when hidden away under a plastic cover under the hood lmao

    • VoterFrog@lemmy.world · 2 days ago

      I find it best to get the agent into a loop where it can self-verify. Give it a clear set of constraints and requirements, give it the context it needs to understand the space, give it a way to verify that it’s completed its task successfully, and let it go off. Agents may stumble around a bit, but as long as you’ve made the task manageable, they’ll self-correct and get there.
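
      As a sketch of that shape (callAgent and runChecks are hypothetical stand-ins for your agent harness and your verification step, such as a compiler or test suite, not any real SDK):

      ```typescript
      // Hypothetical self-verification loop: the agent only "finishes" when
      // an independent check passes; failures get fed back as context.
      type CheckResult = { passed: boolean; feedback: string };

      declare function callAgent(task: string, feedback?: string): Promise<string>;
      declare function runChecks(patch: string): Promise<CheckResult>;

      async function solveWithVerification(task: string, maxAttempts = 5): Promise<string> {
        let feedback: string | undefined;
        for (let attempt = 1; attempt <= maxAttempts; attempt++) {
          const patch = await callAgent(task, feedback); // agent proposes a change
          const result = await runChecks(patch);         // verify against tests/constraints
          if (result.passed) return patch;               // self-verified: done
          feedback = result.feedback;                    // stumble, then self-correct
        }
        throw new Error("Did not converge; the task probably needs to be split up");
      }
      ```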

    • SleeplessCityLights@programming.dev · 2 days ago

      I like your strategy. I use a system prompt that forces it to ask a question if there are options or if it has to make assumptions. Controlling context is key. It will get lost if it has too much, so I start a new chat frequently. I also will do the same prompts on two models from different providers at the same time and cross-reference the idiots to see if they are lying to me.
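
      A sketch of that cross-check with the official OpenAI and Anthropic Node SDKs (model names are placeholders, and the system prompt is my paraphrase of the “ask before assuming” rule):

      ```typescript
      import OpenAI from "openai";
      import Anthropic from "@anthropic-ai/sdk";

      const system =
        "If the request has multiple valid options, or you would have to make " +
        "an assumption, ask a clarifying question before answering.";
      const prompt = "How should I invalidate this cache?"; // example question

      const openai = new OpenAI();       // reads OPENAI_API_KEY from the environment
      const anthropic = new Anthropic(); // reads ANTHROPIC_API_KEY

      // Same prompt to two providers at once; disagreements flag likely lies.
      const [gpt, claude] = await Promise.all([
        openai.chat.completions.create({
          model: "gpt-4o", // placeholder model name
          messages: [
            { role: "system", content: system },
            { role: "user", content: prompt },
          ],
        }),
        anthropic.messages.create({
          model: "claude-sonnet-4-5", // placeholder model name
          max_tokens: 1024,
          system,
          messages: [{ role: "user", content: prompt }],
        }),
      ]);

      console.log("A:", gpt.choices[0].message.content);
      for (const block of claude.content) {
        if (block.type === "text") console.log("B:", block.text);
      }
      ```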

      • dejected_warp_core@lemmy.world · 2 days ago

        I use a system prompt that forces it to ask a question if there are options or if it has to make assumptions

        I’m kind of amazed that even works. I’ll have to try that. Then again, I’ve asked ChatGPT to “respond to all prompts like a Magic 8-ball” and it knocked it out of the park.

        so I start a new chat frequently.

        I do this as well, and totally forgot to mention it. Yes, I keep the context small and fresh so that prior conversations (and hallucinations) can’t poison new dialogues.

        I also will do the same prompts on two models from different providers at the same time and cross-reference the idiots to see if they are lying to me

        Oooh… straight to my toolbox with that one. Cheers.

        • SleeplessCityLights@programming.dev · 1 day ago

          I forgot another key point: the code snippets they give you are bloated and usually do unnecessary things. You actually have to think, pull out the needed line(s), and clean them up. I never copy-paste.
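
          A made-up before/after of that cleanup (the bloat pattern is typical of generated snippets, not any specific model’s output):

          ```typescript
          declare const userId: string; // assume this is already in scope

          // What you get: wrapper, redundant validation, logging, try/catch.
          async function fetchUser(id: string): Promise<unknown> {
            try {
              if (!id) throw new Error("Invalid user ID provided");
              console.log(`Fetching data for user: ${id}`);
              const response = await fetch(`/api/users/${id}`);
              if (!response.ok) throw new Error(`HTTP error: ${response.status}`);
              const data = await response.json();
              console.log("Successfully fetched user data");
              return data;
            } catch (error) {
              console.error("Error fetching user data:", error);
              throw error;
            }
          }

          // The line you actually needed:
          const user = await (await fetch(`/api/users/${userId}`)).json();
          ```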