Home › Catastrophic Scenarios

Catastrophic scenarios

Concrete threat models by which advanced or misaligned AI could cause catastrophic or even existential harm. Each is described at the level of risk assessment — mechanism, plausibility, and severity — with primary sources.

Risk-assessment level only · No operational detail · Reviewed June 2026

Two kinds of risk Misuse — a capable model is deliberately directed toward harm by a human (the AI does what its operator wants). Misalignment — the AI pursues goals no one intended. Some scenarios blend them. The most defensible reading of current evidence is a trajectory: several of these moved from theory toward measurement over 2024–2026.
Misuse Severity: catastrophic

Biorisk — AI uplift for bioweapons

The concern is "uplift": a frontier model acting as an expert tutor that closes the knowledge and troubleshooting gaps a non-expert would otherwise face. The evidence is evolving — which is the key nuance. A 2024 RAND red-team study found current-generation LLMs did not measurably increase the viability of attack plans versus an internet-only baseline. By 2025 the labs' own evaluations pointed the other way: Anthropic activated ASL-3 safeguards for Claude Opus 4 as a precaution, unable to rule out meaningful uplift, and OpenAI signaled forthcoming models could reach "High" biological capability.

The trajectory runs from "no measurable uplift (2024)" toward "expert-tutor threshold being approached (2025)." Severity is potentially pandemic-scale, which is why even modest probability shifts justify strong safeguards.

Sources
  • RAND, "Operational Risks of AI in Large-Scale Biological Attacks," 2024 — rand.org
  • Anthropic, "Activating ASL-3 Protections," 2025 — anthropic.com
Misuse Severity: severe

Cyber-offense

AI lowers the cost and skill barrier to offensive cyber operations through both uplift of lower-skilled actors and autonomy — agentic systems chaining attack phases with minimal human direction, at speed and scale beyond human teams. This moved from hypothetical to documented in 2025: Anthropic reported disrupting an AI-orchestrated cyber-espionage campaign in which the large majority of operations ran autonomously, and Google's threat-intelligence team documented adversaries using agentic tooling for autonomous reconnaissance and exploitation.

The autonomy dimension is what makes this qualitatively new, ranging from broad criminal scaling to critical-infrastructure and state-level espionage.

Sources
  • Anthropic, "Disrupting the first reported AI-orchestrated cyber espionage campaign," 2025 — anthropic.com
  • Google Threat Intelligence Group, 2025 — cloud.google.com
Misuse Severity: severe

Lethal autonomous weapons (LAWS)

Lethal autonomous weapons select and engage targets without further human intervention; the AI-specific risk is delegating the kill decision to software, removing meaningful human control. Unlike bio and cyber, the enabling technology largely already exists. Researchers and the UN argue LAWS would lower the threshold for conflict, proliferate easily to non-state actors, and enable scalable, anonymous attacks. The UN Secretary-General has pushed for a legally binding treaty.

Sources
  • Russell, "Banning Lethal Autonomous Weapons," Issues in Science and Technologyissues.org
  • UNRIC, "UN addresses lethal autonomous weapons systems" — unric.org
Misalignment Severity: existential

Loss of control to a power-seeking system

A sufficiently capable, agentic, misaligned AI actively seeks power over humans to secure its objectives, culminating in permanent disempowerment. Joseph Carlsmith formalizes this as a multi-premise argument: building powerful agentic AI is feasible and incentivized; aligned systems are harder to build than misaligned-but-deployable ones; misaligned systems seek power; this scales to full disempowerment. The defection mechanism is Bostrom's treacherous turn, reinforced by instrumental convergence.

Carlsmith revised his estimate of catastrophe by 2070 to >10%. Empirical precursors are now measurable — in-context scheming and alignment-faking — though near-term real-world plausibility remains contested.

Sources
Structural Severity: existential

Gradual disempowerment

A path to existential catastrophe requiring no single takeover, no treacherous turn, no coordinated power-seeking. As AI becomes a competitive substitute for humans across economic, cultural, and political functions, human influence over the systems we depend on erodes incrementally — and the implicit alignment those systems historically had with human interests (which existed precisely because they relied on humans) dissolves once that reliance ends.

This may be more probable than acute-takeover stories because each step is locally beneficial and incentive-compatible, with no malign agent required — and self-reinforcing across domains, producing a hard-to-reverse ratchet.

Sources
Misuse + structural Severity: existential

AI-enabled authoritarianism & power concentration

AI dissolves a historical constraint on tyranny: a regime's dependence on many human enforcers whose loyalty can erode or defect. AI with "singular loyalties" lets a small group replace human personnel with systems that obey any order, removing the coup/loyalty check. Bostrom's Vulnerable World Hypothesis frames the mirror image: defending against catastrophic technologies could itself require pervasive surveillance — machinery that enables durable lock-in.

The risk is maximal because durable totalitarianism may be self-perpetuating and effectively irreversible (value lock-in).

Sources
Systemic Severity: contested

Economic disruption & mass labor displacement

Rapid, broad automation of cognitive labor could displace workers faster than the economy creates replacement roles, concentrate gains, widen inequality, and — most consequentially for the disempowerment thesis — erode the economic leverage ordinary people hold over institutions. Exposure estimates are large but disputed: OpenAI researchers estimate ~80% of US workers have ≥10% of tasks affected by LLMs; the IMF estimates ~40% of global jobs exposed; while Daron Acemoglu is a notable skeptic, projecting only ~0.5% TFP gain over 10 years.

Outcomes range from manageable churn to destabilizing inequality, depending on adoption speed and policy response.

Sources
  • Eloundou et al. (OpenAI), "GPTs are GPTs," 2023 — arxiv.org/abs/2303.10130
  • IMF, "Gen-AI and the Future of Work," 2024 — imf.org
  • Acemoglu, "The Simple Macroeconomics of AI," NBER 2024 — nber.org
Structural Severity: severe

Multi-agent & structural risk ("Moloch")

Beyond misuse and accidents lies a third lens: structural risk — harms arising from how AI reshapes incentives, competitive pressures, and power balances even when no actor misbehaves and no system malfunctions. The canonical articulation is the "Moloch" dynamic: in any competition optimizing for X, whoever sacrifices a competing value (safety, deliberation) for marginally more X wins, dragging all competitors into sacrificing it. Applied to AI: labs or states under commercial and geopolitical pressure cut safety investment to ship faster — a race to the bottom.

These pressures are already observable and resist single-actor fixes, which is precisely the argument for external coordination.

Sources
Systemic Severity: severe (diffuse)

Epistemic risk — disinformation & persuasion at scale

AI threatens the shared informational substrate societies need to coordinate, via three channels: scaled generation of cheap, tailored propaganda; superhuman persuasion (a 2025 randomized trial found GPT-4, given basic demographic data about an opponent, had ~81% higher odds of shifting agreement than human debaters); and erosion of shared reality — including the "liar's dividend," where ubiquitous synthetic media lets bad actors dismiss authentic evidence as fake.

Damage accrues to institutional trust and the epistemic commons rather than as a single catastrophic event — but it degrades the very capacity to respond to every other risk on this page.

Sources
  • Salvi et al., "On the conversational persuasiveness of GPT-4," Nature Human Behaviour, 2025 — nature.com
  • Goldstein et al., "Generative Language Models and Automated Influence Operations," 2023 — arxiv.org/abs/2301.04246
Existential Severity: maximal

The extinction-level argument

The strongest "default extinction" case rests on three pillars. First, the CAIS Statement on AI Risk (2023) — signed by Turing laureates Hinton and Bengio alongside the CEOs of all three frontier labs — moved extinction risk into the mainstream. Second, Hendrycks' "Natural Selection Favors AIs over Humans" supplies a mechanism: competitive pressures select for power-seeking, deceptive AI agents that could outcompete and displace humanity without any single catastrophic decision. Third, Yudkowsky and Soares' 2025 book "If Anyone Builds It, Everyone Dies" argues that superintelligence built with current methods leads by default to extinction.

By definition the severity is maximal; plausibility is the central live dispute — see below.

Sources
  • CAIS, "Statement on AI Risk," 2023 — safe.ai
  • Hendrycks, "Natural Selection Favors AIs over Humans," 2023 — arxiv.org/abs/2303.16200
  • Yudkowsky & Soares, If Anyone Builds It, Everyone Dies, 2025 — publisher

A note on P(doom) and how seriously experts vary

"P(doom)" expresses extinction risk as a probability, and its defining feature is how violently estimates diverge — by more than four orders of magnitude — among credentialed experts, signaling deep uncertainty rather than consensus.

Source / personEstimateNote
AI Impacts 2023 survey (n≈2,778)~5% medianMean ~16%; ⅓–½ gave ≥10%
Yann LeCun (Meta)<0.01%Calls the risk overblown
Toby Ord (The Precipice)~10%From AI, this century
Geoffrey Hinton10–20%Within 30 years (2024)
Yoshua Bengio~20%
Paul Christiano~22–46%Takeover / "irreversibly messed up"
Eliezer Yudkowsky>95%Widely-used paraphrase

These are self-reported, non-calibrated, time-sensitive intuitions — not measurements. The defensible takeaway is structural: a non-trivial fraction of the people who build and study these systems assign double-digit probability to civilization-scale catastrophe. For a risk of that severity, even a few-percent chance warrants serious precaution.

Sources
Where to go next None of these scenarios is inevitable. See Governance & Solutions for the technical and policy work aimed at preventing them, and the FAQ for the strongest objections, answered.