An artificial intelligence and cybersecurity researcher claims to have jailbroken Anthropic’s latest AI model, Claude Fable 5, within just 48 hours of it being launched. “Pliny the Liberator,” a well-known figure in the AI community, said on Wednesday he “liberated” Fable 5, launched on Tuesday as a…
… yeah?
If the prompt and the safety mechanisms are in-band, no shit you can trick the almost-intelligent chatbot by being smarter than it.
It’s an interesting exercise though, programming a chatbot to defend itself from sophistry and manipulation.
Interesting? Yes. Effective? No.
Having that kind of skillful assistance available for arbitrary purposes should place significant liability squarely on the provider, maybe even the majority of it in some cases.
E: I am fully aware that I am dreaming.