The Singularity

"The Singularity" is shorthand for a hypothesized moment — or interval — in which AI starts improving itself faster than humans can improve it, and the resulting feedback loop produces something we cannot extrapolate past. This page unpacks where the idea comes from, writes down the math carefully, and then lets you play with the parameters to see how much the conclusion depends on them.

Prereq: basic calculus, scaling · Read time: ~20 min · Interactive figures: 1 · Sources: Good, Vinge, Kurzweil, Yudkowsky, Christiano

1. Where the word comes from

In mathematics, a "singularity" is a point where a function misbehaves — it blows up, becomes undefined, or otherwise stops admitting extrapolation. A black hole contains a gravitational singularity; the function $1/x$ has a singularity at zero. The phrase "technological singularity" borrows the idea: a point in history past which current trends can't be extended in any meaningful way because the underlying assumptions break.

The first recorded use of the phrase in this sense is attributed to John von Neumann in the 1950s — Stanislaw Ulam wrote in a 1958 eulogy that von Neumann had speculated about "the ever accelerating progress of technology and changes in the mode of human life, which gives the appearance of approaching some essential singularity in the history of the race beyond which human affairs, as we know them, could not continue."

Von Neumann didn't elaborate. The idea lay mostly dormant for half a century until AI researchers picked it up and gave it a concrete mechanism.

2. I.J. Good's 1965 argument

Irving John Good — a Bletchley Park cryptanalyst who worked with Turing during WWII — wrote a paper in 1965 titled Speculations Concerning the First Ultraintelligent Machine. The core of the paper is four sentences that have been quoted in every AGI discussion since:

Good, 1965

"Let an ultraintelligent machine be defined as a machine that can far surpass all the intellectual activities of any man however clever. Since the design of machines is one of these intellectual activities, an ultraintelligent machine could design even better machines; there would then unquestionably be an 'intelligence explosion,' and the intelligence of man would be left far behind. Thus the first ultraintelligent machine is the last invention that man need ever make, provided that the machine is docile enough to tell us how to keep it under control."

The argument runs in four steps:

  1. Designing machines is a kind of intellectual work.
  2. A machine smarter than any human can therefore do machine design better than any human.
  3. A machine better at machine design can design a still-better machine — which is itself better at machine design, which can design a still-still-better machine, and so on.
  4. This recursive improvement loop is a positive-feedback system. Positive feedback with no limiting term diverges.

The provision in the last sentence — "provided that the machine is docile enough to tell us how to keep it under control" — is the origin of the alignment problem. Good saw that the same loop that produces a superintelligent ally produces a superintelligent adversary if you haven't solved the steering problem in advance.

3. The intelligence-explosion math

You can write Good's argument as a differential equation. Let $I(t)$ be some scalar "intelligence" of the system at time $t$. Good's claim is that the rate at which the system improves itself is proportional to its current intelligence — because smarter systems can produce larger improvements per unit time.

$$\frac{dI}{dt} = k \, I(t)$$

Linear-feedback intelligence explosion

- $I(t)$: a scalar "effective intelligence" of the system at time $t$. In practice this is a stand-in for "quality of research output per unit of wall-clock time." Different authors operationalize it differently.
- $k$: a rate constant, i.e. how efficiently more intelligence converts into improvements. Folds in compute, data, and research-productivity factors.
- $dI/dt$: the instantaneous rate of improvement.

Intuition This is the same equation as compound interest, population growth, and radioactive decay (the latter with $k$ negative) — it's the math of "rate proportional to current amount." Its solution is exponential growth: $I(t) = I_0 e^{k t}$. Every doubling takes the same amount of wall-clock time.
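The constant-doubling claim is easy to check numerically. A minimal sketch — the value $k = 0.7$ and the sample times are arbitrary choices, not anything from the literature:

```python
import math

def I(t, I0=1.0, k=0.7):
    """Closed-form solution of dI/dt = k*I: exponential growth."""
    return I0 * math.exp(k * t)

t_double = math.log(2) / 0.7   # doubling time ln(2)/k, independent of I

print(I(t_double) / I(0))      # factor of 2 near the start of the curve
print(I(5 + t_double) / I(5))  # the same factor of 2 much later on the curve
```

Both ratios come out to 2: on an exponential, the doubling time never shrinks, which is exactly why this regime is fast but not a singularity.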

Exponential growth is fast but not a singularity — the function is defined for all finite $t$. To get a true singularity you need the rate to grow faster than linearly with intelligence. Good's original argument is usually read as claiming exactly that kind of super-linear return:

$$\frac{dI}{dt} = k \, I(t)^{1 + \epsilon}, \quad \epsilon > 0$$

With $\epsilon = 1$ (the rate grows as $I^2$), the equation is separable:

$$\int \frac{dI}{I^2} = \int k \, dt \implies -\frac{1}{I} = k t + C$$
$$I(t) = \frac{I_0}{1 - k I_0 \, t}$$

Hyperbolic blow-up

- $t^* = 1/(k I_0)$: the "singularity time," the wall-clock moment at which $I(t)$ goes to infinity. For any positive initial intelligence and rate constant, the hyperbolic solution reaches infinity in finite wall-clock time.
- $\epsilon$: controls how much faster the rate grows than linear feedback. $\epsilon = 0$ gives exponential growth (no singularity). $\epsilon > 0$ gives a finite-time blow-up. $\epsilon < 0$ gives sub-exponential, power-law growth: still unbounded, but slowing forever rather than blowing up.
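The finite-time blow-up can be verified numerically. This is a sketch, not any published model: forward-Euler integration of $dI/dt = k I^2$, using $10^6$ as an arbitrary "effectively infinite" cap, compared against the analytic singularity time $t^* = 1/(k I_0)$:

```python
def hyperbolic(t, I0=1.0, k=0.7):
    """Analytic solution I0 / (1 - k*I0*t) of dI/dt = k*I**2."""
    return I0 / (1.0 - k * I0 * t)

def time_to_cap(k=0.7, I0=1.0, dt=1e-5, cap=1e6):
    """Forward-Euler integration of dI/dt = k*I**2 until I exceeds cap."""
    t, I = 0.0, I0
    while I < cap:
        I += k * I**2 * dt
        t += dt
    return t

t_star = 1.0 / (0.7 * 1.0)    # analytic blow-up time, about 1.43
print(time_to_cap(), t_star)  # the numeric crossing hugs the analytic pole
```

However high you set the cap, the crossing time barely moves: essentially all of the growth happens in the final instants before $t^*$, which is what "finite-time blow-up" means in practice.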

What this tells you The debate about whether a "singularity" exists is mostly a debate about the sign of $\epsilon$. It is not a debate about whether AI gets better — everyone agrees it's getting better. It's a debate about whether the curve is sub-exponential (plateau), exponential (fast but open-ended), or super-exponential (finite-time blow-up).

Almost no modern treatment uses $\epsilon > 0$. Tom Davidson's "compute-centric framework for takeoff speeds" (Open Philanthropy, 2023), Paul Christiano's earlier "Takeoff speeds" post, and Epoch AI's models all use sub-linear returns — more intelligence produces more improvement, but with diminishing returns per unit. Those models still give dramatic speedups, just without the infinity.

4. Takeoff speeds — the live debate

"Takeoff speed" is the term of art for how fast the transition from roughly-human AI to vastly-superhuman AI happens. Three archetypes:

Fast / hard takeoff

Days to months. Typical cause: recursive self-improvement inside a single lab, driven by a handful of key insights. Associated with early Yudkowsky (MIRI) and with some recent AI-2027-style scenarios. Scary but, most researchers now think, unlikely: training a new model still requires large amounts of real-world compute that can't be wished into existence.

Slow / soft takeoff

Years. Capability diffuses across multiple labs; models are expensive to train, get adopted, deployed, integrated, and produce economic feedback that funds the next generation. Christiano's "Takeoff speeds" (2018) is the canonical defense of this view; Karnofsky's "most important century" series is a book-length version.

No takeoff

Current methods hit a wall. Scaling exhausts data, reasoning hits verification limits, agent reliability doesn't keep up. The system keeps getting better but not qualitatively; we get very good assistants, not a singularity. LeCun's position.

What changed between 2018 and 2025: the "soft takeoff" scenario is now clearly consistent with what we observe. We've had three years of dramatic but continuous-looking capability gains, with multiple labs at the frontier and each model generation funded directly by commercial deployment of the last. Nothing in the public record looks like a discontinuity. Whether that extrapolates is the question.

5. Interactive: takeoff simulator

The toy model below is deliberately simple. Starting from intelligence $I_0 = 1$ at $t = 0$, it integrates $dI/dt = k \, I^{1+\epsilon}$ until either $I$ exceeds $10^6$ (superintelligence reached) or $t$ exceeds 10 years (plateau). Drag the sliders to change the rate constant $k$ and the super-linearity exponent $\epsilon$.

Default slider values: k (rate) = 0.70, ε (curvature) = 0.00.

ε < 0 : sub-exponential plateau. ε = 0 : clean exponential (no singularity). ε > 0 : hyperbolic blow-up in finite time.
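The simulator's logic fits in a dozen lines. A sketch assuming plain forward-Euler integration (the page's actual implementation may differ); the thresholds match the description above:

```python
def takeoff(k=0.7, eps=0.0, I0=1.0, dt=1e-3, cap=1e6, horizon=10.0):
    """Integrate dI/dt = k * I**(1+eps) until blow-up or timeout.

    Returns ("superintelligence", t) if I crosses the cap inside the
    horizon, else ("plateau", horizon).
    """
    t, I = 0.0, I0
    while t < horizon:
        I += k * I ** (1.0 + eps) * dt
        t += dt
        if I >= cap:
            return "superintelligence", round(t, 2)
    return "plateau", horizon

for eps in (-0.5, 0.0, 0.5):
    print(eps, takeoff(eps=eps))
```

Note that with the default $k = 0.7$, even the clean exponential ($\epsilon = 0$) reaches only about $e^{7} \approx 1100$ in ten years, so the simulator reports it as a plateau despite its open-ended growth; $\epsilon = 0.5$ crosses the cap near its analytic blow-up time of roughly 2.9 years.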

This is a toy model — what it is and isn't

The dynamics above are a caricature. Real takeoff models (Davidson 2023, Epoch AI 2024) use a multi-factor production function with compute, data, and labor, include feedback through empirical returns to research, and fit against observed FLOP/loss curves. But the qualitative conclusion survives: the sign of returns matters more than the magnitude. If returns to research effort are diminishing enough, no feedback loop runs away; if they're flat or increasing, all bets are off.
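To see why the sign of returns dominates the magnitude, here is a deliberately crude two-factor toy (emphatically not Davidson's or Epoch's actual model): research output is Cobb-Douglas in exogenously growing compute $C$ and intelligence $I$, and the exponent $b$ on intelligence decides whether the loop runs away. All parameter values are illustrative assumptions.

```python
import math

def two_factor(b, a=0.5, g_C=0.3, k=0.1, dt=1e-2, horizon=30.0, cap=1e6):
    """Toy feedback loop: dI/dt = k * C**a * I**b.

    b < 1: diminishing returns to intelligence -> growth stays bounded
           over the horizon even though compute keeps rising.
    b > 1: increasing returns -> runaway (capped here at 10**6).
    """
    t, C, I = 0.0, 1.0, 1.0
    while t < horizon and I < cap:
        I += k * C**a * I**b * dt
        C *= math.exp(g_C * dt)   # exogenous compute growth
        t += dt
    return I

print(two_factor(b=0.4) < 1e6)   # diminishing returns: no runaway
print(two_factor(b=1.5) >= 1e6)  # increasing returns: runs away
```

Changing $k$, $a$, or the compute growth rate shifts how fast things move, but only pushing $b$ past 1 changes the qualitative outcome — which is the "sign of returns" point in miniature.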

6. Kurzweil's curves

Ray Kurzweil's The Singularity Is Near (2005) is the book most people associate with "the singularity." His core empirical claim is called the Law of Accelerating Returns: the rate of exponential improvement in information technology is itself exponential, so that plotted on a log scale, progress curves upward rather than being straight lines.

Kurzweil's 2005 predictions, graded against 2026 reality:

- A $1000 computer performs at the level of the human brain (~10¹⁶ cps). Predicted: 2020s. Reality: roughly on schedule; an H100 hits ~2 × 10¹⁵ FP16 FLOPs, and cluster cost per human-brain-equivalent is getting there.
- Human-level machine intelligence (passes a rigorous Turing test). Predicted: 2029. Reality: on track; frontier LLMs pass short-form Turing tests today, but a rigorous long-form version is not yet settled.
- Most human knowledge workers augmented by AI. Predicted: 2020s. Reality: happening right now; 2025 was the year "I use Claude/ChatGPT at work" became a majority.
- Technological singularity — AI surpasses all human intelligence combined. Predicted: 2045. Reality: unresolved; nobody has a credible benchmark for "all human intelligence combined."
- Nanobots in the bloodstream keeping people indefinitely young. Predicted: 2030s. Reality: speculative; progress on senescence-related biology is real (rapamycin, NAD+, partial reprogramming) but nowhere near this description.

Kurzweil is usually derided in academic circles and lionized in futurist ones. The fair assessment: his mid-term technology predictions have aged much better than contemporary critics expected, while his long-term biology predictions have aged worse. His core methodological claim — that exponential curves hold over longer timescales than people's intuitions expect — has survived.

7. Critiques and counter-arguments

The singularity thesis has been attacked on multiple fronts. The strongest criticisms:

The strongest pro-singularity responses:

8. What to take away

Further reading

NEXT UP
→ AI & Society

The math is abstract. The effects on labor, governance, and public epistemics are already measurable. The next page is sourced claims only — what has actually happened, what is measured, what credible forecasts say.