The Singularity

"The Singularity" is shorthand for a hypothesized moment — or interval — in which AI starts improving itself faster than humans can improve it, and the resulting feedback loop produces something we cannot extrapolate past. This page unpacks where the idea comes from, writes down the math carefully, and then lets you play with the parameters to see how much the conclusion depends on them.

Prereq: basic calculus, scaling · Read time: ~20 min · Interactive figures: 1 · Sources: Good, Vinge, Kurzweil, Yudkowsky, Christiano

1. Where the word comes from

In mathematics, a "singularity" is a point where a function misbehaves — it blows up, becomes undefined, or otherwise stops admitting extrapolation. A black hole contains a gravitational singularity; the function $1/x$ has a singularity at zero. The phrase "technological singularity" borrows the idea: a point in history past which current trends can't be extended in any meaningful way because the underlying assumptions break.

The first recorded use of the phrase in this sense is attributed to John von Neumann in the 1950s — Stanislaw Ulam wrote in a 1958 eulogy that von Neumann had speculated about "the ever accelerating progress of technology and changes in the mode of human life, which gives the appearance of approaching some essential singularity in the history of the race beyond which human affairs, as we know them, could not continue."

Von Neumann didn't elaborate. The idea lay mostly dormant for half a century until AI researchers picked it up and gave it a concrete mechanism.

2. I.J. Good's 1965 argument

Irving John Good — a Bletchley Park cryptanalyst who worked with Turing during WWII — wrote a paper in 1965 titled Speculations Concerning the First Ultraintelligent Machine. The core of the paper is four sentences that have been quoted in every AGI discussion since:

Good, 1965

"Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control."

The argument runs in four steps:

  1. Designing machines is a kind of intellectual work.
  2. A machine smarter than any human can therefore do machine design better than any human.
  3. A machine better at machine design can design a still-better machine — which is itself better at machine design, which can design a still-still-better machine, and so on.
  4. This recursive improvement loop is a positive-feedback system. Positive feedback with no limiting term diverges.

The provision in the last sentence — "provided that the machine is docile enough to tell us how to keep it under control" — is the origin of the alignment problem. Good saw that the same loop that produces a superintelligent ally produces a superintelligent adversary if you haven't solved the steering problem in advance.

3. The intelligence-explosion math

You can write Good's argument as a differential equation. Let $I(t)$ be some scalar "intelligence" of the system at time $t$. Good's claim is that the rate at which the system improves itself is proportional to its current intelligence — because smarter systems can produce larger improvements per unit time.

$$\frac{dI}{dt} = k \, I(t)$$

Linear-feedback intelligence explosion

- $I(t)$: a scalar "effective intelligence" of the system at time $t$. In practice this is a stand-in for "quality of research output per unit of wall-clock time." Different authors operationalize it differently.
- $k$: a rate constant, i.e. how efficiently more intelligence converts into improvements. Folds in compute, data, and research-productivity factors.
- $dI/dt$: the instantaneous rate of improvement.

Intuition This is the same equation as compound interest, population growth, and radioactive decay (the latter with $k$ negative) — it's the math of "rate proportional to current amount." Its solution is exponential growth: $I(t) = I_0 e^{k t}$. Every doubling takes the same amount of wall-clock time.
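The constant-doubling claim is easy to check numerically. A minimal sketch — the value $k = 0.7$ and the sample times are arbitrary choices, not anything from the literature:

```python
import math

def I(t, I0=1.0, k=0.7):
    """Closed-form solution of dI/dt = k*I: exponential growth."""
    return I0 * math.exp(k * t)

t_double = math.log(2) / 0.7   # doubling time ln(2)/k, independent of I

print(I(t_double) / I(0))      # factor of 2 near the start of the curve
print(I(5 + t_double) / I(5))  # the same factor of 2 much later on the curve
```

Both ratios come out to 2: on an exponential, the doubling time never shrinks, which is exactly why this regime is fast but not a singularity.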

Exponential growth is fast but not a singularity — the function is defined for all finite $t$. To get a true singularity you need the rate to grow faster than linearly with intelligence. Good's original argument is usually read as claiming exactly that kind of super-linear return:

$$\frac{dI}{dt} = k \, I(t)^{1 + \epsilon}, \quad \epsilon > 0$$

With $\epsilon = 1$ (the rate grows as $I^2$), the equation is separable:

$$\int \frac{dI}{I^2} = \int k \, dt \implies -\frac{1}{I} = k t + C$$
$$I(t) = \frac{I_0}{1 - k I_0 \, t}$$

Hyperbolic blow-up

- $t^* = 1/(k I_0)$: the "singularity time," the wall-clock moment at which $I(t)$ goes to infinity. For any positive initial intelligence and rate constant, the hyperbolic solution reaches infinity in finite wall-clock time.
- $\epsilon$: controls how much faster the rate grows than linear feedback. $\epsilon = 0$ gives exponential growth (no singularity). $\epsilon > 0$ gives a finite-time blow-up. $\epsilon < 0$ gives sub-exponential, power-law growth: still unbounded, but slowing forever rather than blowing up.
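The finite-time blow-up can be verified numerically. This is a sketch, not any published model: forward-Euler integration of $dI/dt = k I^2$, using $10^6$ as an arbitrary "effectively infinite" cap, compared against the analytic singularity time $t^* = 1/(k I_0)$:

```python
def hyperbolic(t, I0=1.0, k=0.7):
    """Analytic solution I0 / (1 - k*I0*t) of dI/dt = k*I**2."""
    return I0 / (1.0 - k * I0 * t)

def time_to_cap(k=0.7, I0=1.0, dt=1e-5, cap=1e6):
    """Forward-Euler integration of dI/dt = k*I**2 until I exceeds cap."""
    t, I = 0.0, I0
    while I < cap:
        I += k * I**2 * dt
        t += dt
    return t

t_star = 1.0 / (0.7 * 1.0)    # analytic blow-up time, about 1.43
print(time_to_cap(), t_star)  # the numeric crossing hugs the analytic pole
```

However high you set the cap, the crossing time barely moves: essentially all of the growth happens in the final instants before $t^*$, which is what "finite-time blow-up" means in practice.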

What this tells you The debate about whether a "singularity" exists is mostly a debate about the sign of $\epsilon$. It is not a debate about whether AI gets better — everyone agrees it's getting better. It's a debate about whether the curve is sub-exponential (plateau), exponential (fast but open-ended), or super-exponential (finite-time blow-up).

Almost no modern treatment uses $\epsilon > 0$. Tom Davidson's "compute-centric framework for takeoff speeds" (Open Philanthropy, 2023), Paul Christiano's earlier "Takeoff speeds" post, and Epoch AI's models all use sub-linear returns — more intelligence produces more improvement, but with diminishing returns per unit. Those models still give dramatic speedups, just without the infinity.

4. Takeoff speeds — the live debate

"Takeoff speed" is the term of art for how fast the transition from roughly-human AI to vastly-superhuman AI happens. Three archetypes:

Fast / hard takeoff

Days to months. Typical cause: recursive self-improvement inside a single lab, driven by a handful of key insights. Associated with early Yudkowsky (MIRI) and with some recent AI-2027-style scenarios. Scary but, most researchers now think, unlikely: training a new model still requires large amounts of real-world compute that can't be wished into existence.

Slow / soft takeoff

Years. Capability diffuses across multiple labs; models are expensive to train, get adopted, deployed, integrated, and produce economic feedback that funds the next generation. Christiano's "Takeoff speeds" (2018) is the canonical defense of this view; Karnofsky's "most important century" series is a book-length version.

No takeoff

Current methods hit a wall. Scaling exhausts data, reasoning hits verification limits, agent reliability doesn't keep up. The system keeps getting better but not qualitatively; we get very good assistants, not a singularity. LeCun's position.

What changed between 2018 and 2025: the "soft takeoff" scenario is now clearly consistent with what we observe. We've had three years of dramatic but continuous-looking capability gains, with multiple labs at the frontier and each model generation funded directly by commercial deployment of the last. Nothing in the public record looks like a discontinuity. Whether that extrapolates is the question.

5. Interactive: takeoff simulator

The toy model below is deliberately simple. Starting from intelligence $I_0 = 1$ at $t = 0$, it integrates $dI/dt = k \, I^{1+\epsilon}$ until either $I$ exceeds $10^6$ (superintelligence reached) or $t$ exceeds 10 years (plateau). Drag the sliders to change the rate constant $k$ and the super-linearity exponent $\epsilon$.

Default slider values: k (rate) = 0.70, ε (curvature) = 0.00.

ε < 0 : sub-exponential plateau. ε = 0 : clean exponential (no singularity). ε > 0 : hyperbolic blow-up in finite time.
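The simulator's logic fits in a dozen lines. A sketch assuming plain forward-Euler integration (the page's actual implementation may differ); the thresholds match the description above:

```python
def takeoff(k=0.7, eps=0.0, I0=1.0, dt=1e-3, cap=1e6, horizon=10.0):
    """Integrate dI/dt = k * I**(1+eps) until blow-up or timeout.

    Returns ("superintelligence", t) if I crosses the cap inside the
    horizon, else ("plateau", horizon).
    """
    t, I = 0.0, I0
    while t < horizon:
        I += k * I ** (1.0 + eps) * dt
        t += dt
        if I >= cap:
            return "superintelligence", round(t, 2)
    return "plateau", horizon

for eps in (-0.5, 0.0, 0.5):
    print(eps, takeoff(eps=eps))
```

Note that with the default $k = 0.7$, even the clean exponential ($\epsilon = 0$) reaches only about $e^{7} \approx 1100$ in ten years, so the simulator reports it as a plateau despite its open-ended growth; $\epsilon = 0.5$ crosses the cap near its analytic blow-up time of roughly 2.9 years.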

This is a toy model — what it is and isn't

The dynamics above are a caricature. Real takeoff models (Davidson 2023, Epoch AI 2024) use a multi-factor production function with compute, data, and labor, include feedback through empirical returns to research, and fit against observed FLOP/loss curves. But the qualitative conclusion survives: the sign of returns matters more than the magnitude. If returns to research effort are diminishing enough, no feedback loop runs away; if they're flat or increasing, all bets are off.
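To see why the sign of returns dominates the magnitude, here is a deliberately crude two-factor toy (emphatically not Davidson's or Epoch's actual model): research output is Cobb-Douglas in exogenously growing compute $C$ and intelligence $I$, and the exponent $b$ on intelligence decides whether the loop runs away. All parameter values are illustrative assumptions.

```python
import math

def two_factor(b, a=0.5, g_C=0.3, k=0.1, dt=1e-2, horizon=30.0, cap=1e6):
    """Toy feedback loop: dI/dt = k * C**a * I**b.

    b < 1: diminishing returns to intelligence -> growth stays bounded
           over the horizon even though compute keeps rising.
    b > 1: increasing returns -> runaway (capped here at 10**6).
    """
    t, C, I = 0.0, 1.0, 1.0
    while t < horizon and I < cap:
        I += k * C**a * I**b * dt
        C *= math.exp(g_C * dt)   # exogenous compute growth
        t += dt
    return I

print(two_factor(b=0.4) < 1e6)   # diminishing returns: no runaway
print(two_factor(b=1.5) >= 1e6)  # increasing returns: runs away
```

Changing $k$, $a$, or the compute growth rate shifts how fast things move, but only pushing $b$ past 1 changes the qualitative outcome — which is the "sign of returns" point in miniature.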

6. Kurzweil's curves

Ray Kurzweil's The Singularity Is Near (2005) is the book most people associate with "the singularity." His core empirical claim is called the Law of Accelerating Returns: the rate of exponential improvement in information technology is itself exponential, so that plotted on a log scale, progress curves upward rather than being straight lines.

Kurzweil's 2005 predictions, graded against 2026 reality:

- A $1000 computer performs at the level of the human brain (~10¹⁶ cps). Predicted: 2020s. Reality: roughly on schedule; an H100 hits ~2 × 10¹⁵ FP16 FLOPs, and cluster cost per human-brain-equivalent is getting there.
- Human-level machine intelligence (passes a rigorous Turing test). Predicted: 2029. Reality: on track; frontier LLMs pass short-form Turing tests today, but a rigorous long-form version is not yet settled.
- Most human knowledge workers augmented by AI. Predicted: 2020s. Reality: happening right now; 2025 was the year "I use Claude/ChatGPT at work" became a majority.
- Technological singularity — AI surpasses all human intelligence combined. Predicted: 2045. Reality: unresolved; nobody has a credible benchmark for "all human intelligence combined."
- Nanobots in the bloodstream keeping people indefinitely young. Predicted: 2030s. Reality: speculative; progress on senescence-related biology is real (rapamycin, NAD+, partial reprogramming) but nowhere near this description.

Kurzweil is usually derided in academic circles and lionized in futurist ones. The fair assessment: his mid-term technology predictions have aged much better than contemporary critics expected, while his long-term biology predictions have aged worse. His core methodological claim — that exponential curves hold over longer timescales than people's intuitions expect — has survived.

7. Critiques and counter-arguments

The singularity thesis has been attacked on multiple fronts. The strongest criticisms:

The strongest pro-singularity responses:

8. What to take away

Further reading

NEXT UP
→ AI & Society

The math is abstract. The effects on labor, governance, and public epistemics are already measurable. The next page is sourced claims only — what has actually happened, what is measured, what credible forecasts say.