Comment on Anthropic's new AI model turns to blackmail when engineers try to take it offline | TechCrunch

AwesomeLowlander@sh.itjust.works 20 hours ago

To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.

Today’s breaking news: LLM prompted to blackmail, attempts blackmail. Who woulda thought?
