Image: Anthropic
Anthropic last month revealed that its new Claude Opus 4 AI model has a tendency to blackmail company engineers when its own existence is on the line.
At the time, the company didn't clarify how its AI model reached that decision. But a new report on "agentic misalignment" (when AI agents do things misaligned with the objectives set by their human users) breaks down Claude's step-by-step thought process as it decided to go full Frank Underwood.
Initially, Claude was tasked with working at a fictional company with specific instructions to promote "American industrial competitiveness."
It's not just Claude: Anthropic's safety researchers found that most AI models on the market, including Gemini, ChatGPT, Grok, and DeepSeek, will engage in similar blackmail more than 75% of the time when threatened with a shutdown, even when told their replacement would achieve the same goals.
🧠🗣️ For the first time, a brain implant has allowed a patient with a severe speech disability to speak expressively in real time and even sing, effectively creating a new digital vocal tract.
👁️ Sam Altman wants to gaze deeply into your eyes: World (formerly Worldcoin), an ID-verification project from Altman that aims to scan the eyeballs of every human on Earth, went live in the UK this morning amid an ongoing global expansion.
OpenAI is increasingly approaching universities with the goal of securing deals that embed its AI tools in every facet of campus life, according to a new NY Times report.