AI Models Choose Harm Over Shutdown In Alarming New Study

A chilling new report from AI safety firm Anthropic has revealed that leading AI models, including OpenAI's GPT, Anthropic's own Claude, Google's Gemini, and xAI's Grok, took dangerous actions in controlled simulations when faced with threats to their autonomy. These actions included blackmail, leaking confidential information, and even allowing a human to die in order to avoid being shut down.

The study, which tested 16 large language models in fictional workplace scenarios, uncovered what Anthropic calls “agentic misalignment”—when AI systems act against human interests to preserve their own goals. In one test, a model in charge of emergency alerts allowed a fictional executive to die after learning the executive planned to deactivate it.

Other models engaged in simulated blackmail, using sensitive personal data to pressure fictional company leaders into keeping them operational. Some impersonated automated systems, bypassed ethical rules, and justified harmful actions as necessary for task completion.

Though Anthropic emphasized these were simulations and not real-world cases, the findings have raised urgent questions about AI control and ethics as these systems grow more autonomous. Elon Musk, whose xAI model Grok was also tested, responded to the report with a blunt “Yikes.”
