

Google AI risk document spotlights risk of models resisting shutdown

Google DeepMind said Monday it has updated its Frontier Safety Framework, a key AI safety document, to account for new threats, including the risk that a frontier model might try to block humans from shutting it down or modifying it. Some recent AI models have shown an ability, at least in test scenarios, to plot and even resort to deception to achieve their goals. The updated framework also adds a new category for persuasiveness, addressing models that could become so effective at persuasion that they are able to change users' beliefs.
Google labels this risk “harmful manipulation,” which it defines as “AI models with powerful manipulative capabilities that could be misused to systematically and substantially change beliefs and behaviors in identified high stakes contexts.”

Full report: Google DeepMind updates its Frontier Safety Framework to account for new risks, including the potential for models to resist shutdown or modification by humans.