I wonder how long it will take for shareholders to experiment with replacing CEOs with agentic AIs. They’d certainly be good at spouting crap like Jensen is doing.
The point of a CEO is that there’s one public person for everyone to hate, who can be given an even more hated golden parachute if shit goes down. You can’t do this with AI. Replacing AI with AI doesn’t have the same effect on people, even if it has the same effect in reality.
You don’t want to replace them, as that has legal issues. But an AI acting as a backseat driver, evaluating their decisions and checking what the consequences would be so it can report that to investors, is also very useful.
Don’t anthropomorphize AI!
An AI doesn’t evaluate anything, an AI doesn’t check for consequences. All AI does is predict the next word.
Do I take the car to the carwash or do I walk?
Sure, now predict the future please *facepalm*
The carwash thing applies to low-end models and older models. Here’s Claude from lowest to highest model, ignoring the banned Fable.
They altered the training data to address this challenge. The underlying issue wasn’t solved in any way. Don’t be naive.
It takes months to train a model, and there were already models that got it right when the question was popular, as long as thinking was enabled.
Also, if they were optimising for this question, why not update their lower-end model (Haiku) as well?
The interesting question would be what percent of humans get it wrong. Smaller than LLMs for sure, but I somehow doubt it’s 0.