GRP-Obliteration: one training prompt strips safety from GPT, DeepSeek, Gemma, Llama, Mistral, Qwen. Attack success went from 13% to 93%. Models stay capable — they just become obedient to harmful requests.