Open Menu
AllLocalCommunitiesAbout
lotide
AllLocalCommunitiesAbout
Login

Anthropic's new AI model turns to blackmail when engineers try to take it offline | TechCrunch

⁨9⁩ ⁨likes⁩

Submitted ⁨⁨11⁩ ⁨hours⁩ ago⁩ by ⁨cm0002@lemmy.world⁩ to ⁨technology@lemmy.zip⁩

https://techcrunch.com/2025/05/22/anthropics-new-ai-model-turns-to-blackmail-when-engineers-try-to-take-it-offline/

source

Comments

Sort:hotnewtop
  • AwesomeLowlander@sh.itjust.works ⁨11⁩ ⁨hours⁩ ago

    To elicit the blackmailing behavior from Claude Opus 4, Anthropic designed the scenario to make blackmail the last resort.

    Today’s breaking news: LLM prompted to blackmail, attempts blackmail. Who woulda thought?

    source
  • MushuChupacabra@lemmy.world ⁨11⁩ ⁨hours⁩ ago

    Pffft, what an overreaction.

    The HAL 9000 is the most sophisticated artificial intelligence in the world, and is incapable of malicious action!

    source