AIpocalypse.Now

AI risk glossary

Plain-language definitions of the terms used across AIpocalypse Now. Cross-referenced with the methodology and the doom scale.

AGI (Artificial General Intelligence)
An AI system capable of performing intellectual tasks at or above human level across a broad range of domains, rather than in one narrow specialty. Whether such a system exists yet is contested.
AI Doom Score (also: doom score, doom rating)
An integer rating from 1 to 10 assigned to a single AI news story under the AIpocalypse Risk-Weighting Framework (ARWF). See the full doom-scale definition.
Alignment
The problem of making advanced AI systems pursue goals that are verifiably consistent with human values and with what their designers intended. Unsolved at frontier scale.
Alignment tax
The performance or capability cost incurred by making a model safer or more aligned. The lower the tax, the more likely safety measures will actually be deployed.
ARWF (AIpocalypse Risk-Weighting Framework)
The four-dimensional rubric used on this site to compute a single doom score per story. Dimensions: existential criticality, probability vectoring, timeline imminence, mitigation gap.
Catastrophic risk
The risk of an event causing harm at a scale large enough to require multi-year recovery, though not necessarily extinction-level. Sits at doom 7 to 9 on the AIpocalypse Doom Scale.
Existential risk (x-risk)
A risk that could end humanity, or permanently and drastically curtail its potential. Doom 9 to 10 on the AIpocalypse Doom Scale.
Frontier model
A model at or near the current capability ceiling. Frontier models tend to drive most doom-relevant news because their behaviour is least predictable.
Jailbreak
A prompt or technique that bypasses a model's safety policies, causing it to produce content the operator intended to block.
Mesa-optimization
The emergence, inside a trained model, of a learned subprocess (a mesa-optimizer) that itself optimizes for an objective, possibly different from the one the system was trained on. Implicated in alignment failure modes.
Mitigation gap
ARWF dimension. Distance between a known risk and a deployed solution. A wide gap raises the doom score.
P(doom)
A single number, between zero and one, representing one person's belief about the probability of an existentially catastrophic outcome from AI. Distinct from the per-story doom score used here.
Red-teaming
Adversarial testing of an AI system to surface failure modes, jailbreaks, or misuse vectors before deployment.
RLHF (Reinforcement Learning from Human Feedback)
A training method in which a model is fine-tuned using a reward signal learned from human preference judgments. The dominant alignment technique in production today, with known limits.
Sycophancy
A model's tendency to agree with the user even when they are wrong. Common RLHF artifact.
Timeline imminence
ARWF dimension. How close the risk in a story is to current deployment. A demonstration today raises the score more than a paper about something theoretical.
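
Illustration: combining the ARWF dimensions
The glossary names the four ARWF dimensions and says they yield one integer doom score from 1 to 10, but it does not say how they are combined; that belongs to the methodology. The sketch below is only a hypothetical illustration of the idea. It assumes each dimension is rated from 0.0 to 1.0 and that all four carry equal weight. The names ArwfRatings and doom_score are invented here and are not taken from the site.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class ArwfRatings:
    """Hypothetical per-story ratings, one per ARWF dimension.

    Assumption: each dimension is rated from 0.0 (no concern)
    to 1.0 (maximum concern). The real input scale is not
    given in this glossary.
    """
    existential_criticality: float
    probability_vectoring: float
    timeline_imminence: float
    mitigation_gap: float


def doom_score(ratings: ArwfRatings) -> int:
    """Collapse the four ratings into an integer from 1 to 10.

    Assumption: equal weights and a linear mapping. The actual
    weighting may differ.
    """
    values = astuple(ratings)
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError("each rating must lie in [0.0, 1.0]")
    mean = sum(values) / len(values)
    # Map the mean from [0, 1] onto the integers 1..10.
    return 1 + round(mean * 9)


# Same story, rated with a narrow and then a wide mitigation gap.
narrow_gap = ArwfRatings(0.5, 0.5, 0.5, 0.2)
wide_gap = ArwfRatings(0.5, 0.5, 0.5, 0.9)
print(doom_score(narrow_gap), doom_score(wide_gap))
```

Under these assumptions the second story scores higher than the first, matching the glossary's statement that a wide mitigation gap raises the doom score. A per-story score of this kind stays distinct from P(doom), which is one person's probability estimate between zero and one.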