People whose work I keep returning to. Most sit at the seams between fields — physics and biology, math and computation, energy and information. Reverse chronological, newest first.

[Interactive figure: A constellation of minds. Ideas threaded through time. Each person has their own column; the bar spans their lifespan (birth to death, or to today for the living), with the node at its midpoint. Newest at the top.]

Sergey Levine

Berkeley professor whose work casts reinforcement learning, control, and planning as variational inference under one Bayesian frame. His 2018 RL and Control as Probabilistic Inference tutorial made the equivalence exact rather than analogical: a maximum-entropy control objective is a variational free energy — the same evidence lower bound that turns up in approximate inference. That identity is the thread I keep returning to whenever I think about decision-making under uncertainty.

Richard Sutton

Co-author, with Andrew Barto, of Reinforcement Learning: An Introduction and the temporal-difference methods at the core of the field — work that earned them the 2024 Turing Award. The Bitter Lesson is the essay I keep coming back to: the long-run winners are general methods that scale with computation, not the cleverness we hand-build into them.

Satoshi Nakamoto

Pseudonymous author of the 2008 Bitcoin white paper, and, whoever they were, the architect of a genuinely new data structure: a hash-linked chain of blocks, each committing to its transactions through a Merkle tree, extended by proof-of-work and kept honest by the rule that the valid chain with the most accumulated work (loosely, the longest) wins. What I keep returning to is not the currency but the algorithm beneath it: Nakamoto consensus, the first practical solution to the Byzantine Generals Problem in an open network where anyone can join, no one is trusted, and there is no central authority to call the vote. It made distributed agreement among mutually suspicious strangers an engineering reality, assembling a handful of old cryptographic primitives (hashing, Merkle trees, Hashcash-style proof-of-work) into something nobody had quite seen before.
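The structure described above can be sketched in a few lines. This is a toy illustration, not Bitcoin's actual format: the JSON header, the single SHA-256 pass, the "leading hex zeros" difficulty rule, and every function name here are assumptions made for readability.

```python
import hashlib
import json


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def merkle_root(transactions: list[str]) -> str:
    """Hash transactions pairwise, level by level, down to a single root."""
    if not transactions:
        return sha256_hex(b"")
    level = [sha256_hex(tx.encode()) for tx in transactions]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])  # duplicate the last hash on odd-sized levels
        level = [sha256_hex((level[i] + level[i + 1]).encode())
                 for i in range(0, len(level), 2)]
    return level[0]


def block_hash(prev_hash: str, root: str, nonce: int) -> str:
    """Hash a toy header: previous block hash, Merkle root, and nonce."""
    header = json.dumps({"prev": prev_hash, "root": root, "nonce": nonce},
                        sort_keys=True)
    return sha256_hex(header.encode())


def mine_block(prev_hash: str, transactions: list[str], difficulty: int = 3) -> dict:
    """Proof-of-work: try nonces until the hash starts with `difficulty` hex zeros."""
    root = merkle_root(transactions)
    nonce = 0
    while True:
        digest = block_hash(prev_hash, root, nonce)
        if digest.startswith("0" * difficulty):
            return {"prev": prev_hash, "root": root, "nonce": nonce,
                    "hash": digest, "txs": transactions}
        nonce += 1


def is_valid_chain(chain: list[dict], difficulty: int = 3) -> bool:
    """Check the hash links, Merkle roots, and proof-of-work of every block."""
    prev = "0" * 64  # placeholder "previous hash" for the first block
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["root"] != merkle_root(block["txs"]):
            return False
        if block["hash"] != block_hash(block["prev"], block["root"], block["nonce"]):
            return False
        if not block["hash"].startswith("0" * difficulty):
            return False
        prev = block["hash"]
    return True


# Two blocks with made-up transactions.
chain = [mine_block("0" * 64, ["alice->bob:1"])]
chain.append(mine_block(chain[-1]["hash"], ["bob->carol:1", "carol->dave:2"]))
```

Changing any transaction in an early block changes its Merkle root, which invalidates that block's hash and every link after it; rewriting history means redoing the proof-of-work for the whole suffix.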

Marc Toussaint

Roboticist (TU Berlin) who turned planning and trajectory optimization into approximate inference — solving Markov decision problems by message passing on a graphical model rather than by dynamic programming. The work made planning as inference a usable engineering tool, and is part of why casting control as Bayesian computation reads less like a metaphor than like the right level of description.

Joshua Tenenbaum (1972– )

MIT cognitive scientist reverse-engineering the mind as Bayesian inference — intuitive physics, concept learning from a single example, and a Bayesian theory of mind that casts how we read other people’s beliefs and goals as inverse planning over generative models. Because Bayesian updating is itself variational free-energy minimization, this sits on the same footing as Friston’s free energy and the control-as-inference line (Kappen, Toussaint, Levine) — the clearest demonstration I know that the structured, few-shot character of human thought is itself probabilistic inference.

Michael Levin

Tufts biologist studying bioelectricity, regeneration, and the agency of non-neural tissue — xenobots, anthrobots, and a body of work arguing that cells navigate a platonic space of possible forms. The clearest current case I know that intelligence and goal-directedness predate brains by a long way.

David Krakauer (1967– )

Evolutionary theorist and president of the Santa Fe Institute, where he studies the deep history of intelligence as a problem in information and complexity — how living and cultural systems acquire, store, and process information, and how that capacity undergoes major evolutionary transitions. He gives the loose intuition that life is “information that copies itself” a rigorous shape, and his framing of complexity science is a big part of how I think about where intelligence comes from.

Yann LeCun (1960– )

Made convolutional networks work for real, decades before the field caught up, and shared the 2018 Turing Award for it. What I follow now is his insistence that prediction and self-supervised world models — not ever-bigger language models — are the road to machines that actually understand.

Stephen Wolfram (1959– )

Built Mathematica, then spent decades arguing that simple computational rules, most famously cellular automata like Rule 30, can generate the complexity we see in nature. A New Kind of Science and the Wolfram Physics Project are the most ambitious recent attempts to take “the universe is computation” literally as a research program.
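For a sense of how little machinery Rule 30 needs, here is a minimal sketch of an elementary cellular automaton. The wraparound boundary and the function names are my assumptions; the rule number encodes the next state for each of the eight three-cell neighborhoods.

```python
def step(cells: list[int], rule: int = 30) -> list[int]:
    """Apply one update of an elementary cellular automaton (wraparound edges)."""
    n = len(cells)
    out = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right  # neighborhood as a 3-bit number
        out.append((rule >> index) & 1)              # look up that bit of the rule
    return out


def run(width: int = 31, steps: int = 15, rule: int = 30) -> list[list[int]]:
    """Evolve from a single live cell in the middle of the row."""
    row = [0] * width
    row[width // 2] = 1
    rows = [row]
    for _ in range(steps):
        row = step(row, rule)
        rows.append(row)
    return rows


if __name__ == "__main__":
    for row in run():
        print("".join("#" if c else "." for c in row))
```

Printing the rows shows the familiar irregular triangle growing from one cell, with no randomness anywhere in the code.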

Karl Friston (1959– )

The free energy principle and active inference — a single variational quantity unifying perception, action, and learning under Bayesian inference. It is the same quantity the control-as-inference line (Kappen, Toussaint, Levine) reaches from the other side: maximum-entropy and KL-regularized control objectives are free energies, not merely analogous to them. Living evidence that big foundational frames are still being written.

Hilbert J. Kappen

Biophysicist at Radboud University whose work on KL-control and path-integral control showed that a whole class of stochastic optimal-control problems becomes linearly solvable once the cost is written as a Kullback–Leibler divergence. One of the cleanest demonstrations that control and inference are a single problem — the optimal controller is a posterior, the control cost a free energy — and a direct root of the maximum-entropy reinforcement learning that came after.
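The claim that the optimal controller is a posterior and the control cost a free energy can be checked numerically in a one-step, discrete toy version. This is not Kappen's continuous-time path-integral formulation; the setup, the numbers, and the function names are assumptions. Minimizing expected cost plus KL divergence from a prior (the uncontrolled behavior) gives a controlled distribution proportional to prior times exp(-cost), and the minimum value is the negative log of the normalizer.

```python
import math


def kl_control(prior: list[float], cost: list[float]) -> tuple[list[float], float]:
    """Solve min_p E_p[cost] + KL(p || prior) in closed form.

    Returns the optimal distribution (a Bayes-like posterior) and the
    minimum value (a free energy, -log of the normalizer).
    """
    weights = [q * math.exp(-c) for q, c in zip(prior, cost)]
    z = sum(weights)
    posterior = [w / z for w in weights]
    return posterior, -math.log(z)


def objective(p: list[float], prior: list[float], cost: list[float]) -> float:
    """Expected cost plus KL(p || prior), for comparison against the optimum."""
    expected_cost = sum(pi * ci for pi, ci in zip(p, cost))
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, prior) if pi > 0)
    return expected_cost + kl


# Hypothetical three-outcome problem.
prior = [0.5, 0.3, 0.2]
cost = [2.0, 0.5, 1.0]
posterior, free_energy = kl_control(prior, cost)
```

Evaluating `objective(posterior, prior, cost)` reproduces `free_energy`, and any other distribution, including the prior itself, scores at least as high.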

Avi Wigderson (1956– )

Computational complexity theorist who, more than anyone, mapped the role of randomness in computation: when a randomized algorithm genuinely buys you power, and when that randomness can be derandomized away. Turing Award 2023, Abel Prize 2021. His work is the rigorous backbone under my hunch that randomness is in the eye of the beholder.

Geoffrey Hinton (1947– )

Decades of work on neural networks before they were fashionable — Boltzmann machines, backpropagation as a learning rule, and the AlexNet moment that finally vindicated it all. Nobel in physics, 2024. The patience and the breadth — statistical physics into learning into language — is what I keep noticing.

Ray Solomonoff (1926–2009)

Defined a universal prior over computable hypotheses by weighting each program by 2 to the minus its length in bits, so shorter descriptions count exponentially more. The result is Solomonoff induction: an incomputable but Bayes-optimal predictor, and one half of AIXI. The original answer to “how should one learn from data?”, and the answer everything else still approximates.

Edwin Thompson Jaynes (1922–1998)

Made Bayesian probability and the maximum-entropy principle into a working methodology. Probability Theory: The Logic of Science is one of those books I want to have read three times. Argued, persuasively, that statistical mechanics and inference are the same activity seen from different angles.

Richard Feynman (1918–1988)

QED, path integrals, the diagrams that bear his name. But what inspires me is the style — the insistence on understanding things from scratch, the joy in figuring things out, the conviction that if you cannot explain it simply you do not understand it.

Ilya Prigogine (1917–2003)

Showed that systems far from equilibrium can spontaneously organize into ordered structures by dissipating energy — dissipative structures. Made it possible to talk about life, weather, and cities as physics, not exceptions to it. Nobel in chemistry, 1977.

Claude Shannon (1916–2001)

Invented information theory in one 1948 paper. Defined entropy as the expected log-improbability of a message, and from that single quantity derived the limits on compression, channel capacity, and error correction that every form of digital communication has worked within since.
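"Expected log-improbability" translates directly into code. A minimal sketch, in bits; the function names are mine, and the second function simply treats observed symbol frequencies in a string as the probabilities.

```python
import math
from collections import Counter


def entropy_bits(probabilities) -> float:
    """Shannon entropy H = -sum(p * log2(p)), skipping zero-probability outcomes."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


def empirical_entropy_bits(message: str) -> float:
    """Entropy of the symbol frequencies observed in `message`."""
    counts = Counter(message)
    total = len(message)
    return entropy_bits(c / total for c in counts.values())
```

A fair coin carries one bit per flip, four equally likely symbols carry two, and a certain outcome carries none.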

Alan Turing (1912–1954)

Defined what computation is. Then asked whether machines could think, and on the side wrote a paper showing how chemical reactions can generate biological pattern (morphogenesis). Range matched only by depth.

John von Neumann (1903–1957)

The polymath ideal. Foundations of quantum mechanics, the computer architecture every modern machine still uses, game theory, cellular automata, self-replicating machines — all in one career, none of it shallow.

Andrey Kolmogorov (1903–1987)

Soviet polymath. Axiomatized probability theory in 1933, then three decades later co-founded algorithmic information theory: Kolmogorov complexity, the length of the shortest program that produces a string, the substrate Solomonoff induction sits on. Few people have written down so much of what we now take as given.

R.T. Cox (1898–1991)

Showed that any system of plausible reasoning consistent with a few mild desiderata must obey the rules of probability — Cox’s theorem (1946). The clearest derivation of why probability is the calculus of belief, not just one option among many.

Buckminster Fuller (1895–1983)

Designer, inventor, and relentless generalist who treated the whole planet as an engineering problem — geodesic domes, tensegrity, synergetics, “doing more with less.” Operating Manual for Spaceship Earth is the systems-thinking manifesto I keep recommending. Few people insisted so hard that design is a moral act.

Norbert Wiener (1894–1964)

Founded cybernetics — the study of control and communication in the animal and the machine — and in doing so put feedback, prediction, and information at the center of how we understand both biology and engineering. The intellectual headwater that learning, control, and the free energy principle all flow down from.

Harold Jeffreys (1891–1989)

Geophysicist who, in Theory of Probability (1939), built modern Bayesian statistics — priors, posterior odds, model comparison — decades before any of it was respectable. The Jeffreys prior still bears his name.

Abraham Fraenkel (1891–1965)

Refined and completed Zermelo’s axioms into the system we now call ZFC — the F is his. Most working mathematicians never think about the foundations they stand on, which is exactly the measure of how well he and Zermelo poured them.

George Pólya (1887–1985)

Wrote How to Solve It, the field manual for mathematical thinking. Patterns of Plausible Inference laid out the plausibility-as-extended-logic frame that Jaynes, building on Cox’s theorem, later formalized. Heuristics, plausible reasoning, and combinatorics in equal measure.

Erwin Schrödinger (1887–1961)

Wrote down the wave equation, then in What Is Life? (1944) asked how the laws of physics could give rise to the informational order of living systems. That question helped seed molecular biology, and I think it is still the right question.

Albert Einstein (1879–1955)

Relativity rewrote space and time; the photoelectric effect helped launch quantum theory; and his 1905 paper on Brownian motion turned the random jitter of suspended particles into the decisive evidence that atoms are real. Three revolutions in a single year, and the one I keep returning to is the statistical-mechanics argument that made the invisible countable.

Ernst Zermelo (1871–1953)

Gave set theory its first axioms in 1908 — the Z in ZFC — and isolated the axiom of choice as the load-bearing assumption it really is. The foundation most of mathematics quietly stands on was poured here. He also picked the famous recurrence fight with Boltzmann over whether entropy could truly increase, which is its own kind of inspiring.

Hermann Minkowski (1864–1909)

Einstein’s former teacher, who saw that special relativity was really a statement about geometry — that space and time fuse into a single four-dimensional fabric. “Henceforth space by itself, and time by itself, are doomed to fade away.” He gave relativity the language it still speaks in.

David Hilbert (1862–1943)

German mathematician who set the agenda for twentieth-century mathematics: the 23 problems, the formalist program, and Hilbert spaces, the infinite-dimensional geometry that both quantum mechanics and modern machine learning live in. His Entscheidungsproblem (is there a procedure to decide any mathematical statement?) is the question Turing answered, in the negative, by defining the abstract machine that became the blueprint for the computer.

Max Planck (1858–1947)

To explain the spectrum of blackbody radiation, he was forced to assume energy comes in discrete packets — a move he called an act of desperation, and which cracked open the quantum era he never fully made peace with. The constant that bears his name sets the scale of the small.

Giuseppe Peano (1858–1932)

Gave arithmetic its axioms — the small set of assumptions from which the natural numbers and induction follow — and invented much of the symbolic notation that modern logic and set theory are written in. Foundations built so cleanly they became invisible, which is the highest compliment.

Heinrich Hertz (1857–1894)

Took Maxwell’s equations off the page and into the lab — generating and detecting electromagnetic waves, proving that light and radio are the same phenomenon at different wavelengths. The experiment that turned a beautiful theory into an undeniable fact. Dead at 36.

Nikola Tesla (1856–1943)

Engineer-visionary with an almost preternatural feel for electromagnetic phenomena. Alternating current, the induction motor, polyphase distribution — the substrate of modern electrification. Imagined wireless power transmission decades before the math caught up.

Hendrik Lorentz (1853–1928)

Worked out the transformations that carry one observer’s space and time into another’s — the mathematical machinery Einstein later reinterpreted as relativity. A bridge figure between classical electromagnetism and the new physics, and by all accounts the gentlest of its founders.

Felix Klein (1849–1925)

His Erlangen Program recast all of geometry as the study of what stays invariant under a group of transformations — a unifying idea so deep it still structures how I think about symmetry in physics and machine learning alike. Geometry is group theory, seen from the right angle.

Ludwig Boltzmann (1844–1906)

Built statistical mechanics. S = k log W carved entropy into the bridge between microscopic chaos and macroscopic order — the equation that makes thermodynamics, information theory, and most of what comes after possible.

Josiah Willard Gibbs (1839–1903)

American physicist who founded statistical mechanics and chemical thermodynamics almost single-handedly — ensembles, free energy, the Gibbs distribution that still shows up everywhere from physics to machine learning. He worked in near-total isolation at Yale, publishing in an obscure journal, and quietly built half the vocabulary I use to think about energy and probability.

James Clerk Maxwell (1831–1879)

Unified electricity, magnetism, and light into a single set of equations — the first true field theory, and the template for every one since. His kinetic theory gave us the Maxwell–Boltzmann distribution, and Maxwell’s demon still frames how I think about the link between entropy, information, and the cost of knowing.

Jules Antoine Lissajous (1822–1880)

French physicist who studied vibration and sound — and found a way to see it. By bouncing light off small mirrors fixed to two vibrating tuning forks, he turned a pair of perpendicular oscillations into the standing curves that now bear his name. Lissajous figures are still how we read frequency and phase off an oscilloscope — an early, beautiful case of making an invisible dynamical relationship visible.
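The figures come from nothing more than two perpendicular sinusoids plotted against each other. A minimal sketch that generates the points; the function name, parameter names, and sampling scheme are assumptions, and plotting is left to whatever tool is at hand.

```python
import math


def lissajous(freq_x: int, freq_y: int, phase: float,
              samples: int = 360) -> list[tuple[float, float]]:
    """Sample x = sin(freq_x * t + phase), y = sin(freq_y * t) over one period."""
    points = []
    for k in range(samples):
        t = 2 * math.pi * k / samples
        points.append((math.sin(freq_x * t + phase), math.sin(freq_y * t)))
    return points
```

Equal frequencies with zero phase collapse to a diagonal line, a quarter-cycle phase shift opens that line into a circle, and a 2:1 frequency ratio gives a figure-eight style curve, which is how the shape on a scope reveals frequency ratio and phase.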

Hermann von Helmholtz (1821–1894)

One of the last people to be a master of physics and physiology at once. He stated the conservation of energy with full generality, introduced the free energy that still bears his name, and framed perception as unconscious inference — the brain as a prediction machine — a century before that became the going theory. A direct ancestor of how I think about both thermodynamics and the mind.

Ada Lovelace (1815–1852)

Writing about Babbage’s Analytical Engine, she published what is generally called the first algorithm intended for a machine — and, more strikingly, grasped that such a machine could manipulate any symbols, not just numbers: music, language, anything representable. The first person to see the computer rather than the calculator.

Charles Darwin (1809–1882)

Showed that the staggering complexity and apparent design of living things needs no designer — only descent with modification under natural selection, run for long enough. On the Origin of Species is the clearest demonstration I know that a simple, mechanical process, iterated, can climb toward complex adaptation, and it sits underneath nearly every later idea here about how intelligence and order arise without being put there on purpose.

Charles Babbage (1791–1871)

Designed the Difference Engine, then the far more ambitious Analytical Engine — a general-purpose, programmable mechanical computer, conceived from the early 1830s, with a store, a mill, and punched-card control. The truly revolutionary idea: a single machine that could be told to compute anything. He never finished building it; the design was a century ahead of the tools to realize it.

Pierre-Simon Laplace (1749–1827)

French mathematician and astronomer who did more than anyone to turn Bayes’ insight into a working science. He independently derived the rule for updating on evidence, gave us the rule of succession and the principle that flat priors encode ignorance, and put it to work — estimating the mass of Saturn from noisy observations and bounding his own error. His Laplace’s demon is still the cleanest statement of what perfect prediction would mean, and the foil every account of probability and chaos has had to answer since.
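The rule of succession is short enough to state in code. Under a flat prior on an unknown success probability, after s successes in n trials the probability that the next trial succeeds is (s + 1) / (n + 2). A minimal sketch using exact fractions; the function name is mine.

```python
from fractions import Fraction


def rule_of_succession(successes: int, trials: int) -> Fraction:
    """Probability of success on the next trial, given a flat prior on the rate."""
    if not 0 <= successes <= trials:
        raise ValueError("need 0 <= successes <= trials")
    return Fraction(successes + 1, trials + 2)
```

With no data the answer is 1/2, and ten successes in ten trials gives 11/12 rather than certainty, which is the point: finite evidence never pushes the estimate all the way to 0 or 1.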

Joseph-Louis Lagrange (1736–1813)

Italian-French mathematician who recast Newtonian mechanics as a variational principle: nature extremizes an action, and the equations of motion fall out of it. The Mécanique analytique did all of mechanics with calculus and no diagrams, and the calculus of variations he built with Euler is the same machinery that turns up later in least action, optimal control, and free-energy objectives.

David Hume (1711–1776)

Scottish Enlightenment philosopher and empiricist whose skepticism cut to the root of inference. He argued that we never observe causation itself, only constant conjunction — our sense of cause and effect is a habit of mind, not a logical necessity — and posed the problem of induction: no number of past observations can logically guarantee the next one. It is the sharpest challenge any account of learning from data has to answer, and the reason the replies that followed (Bayes, Laplace, Jaynes) had to be probabilistic rather than certain.

Leonhard Euler (1707–1783)

The most prolific mathematician who ever lived, and the source of much of the notation we still use — e, i, f(x), Σ. He founded graph theory with the bridges of Königsberg, did foundational work across analysis and mechanics, and kept producing after going blind. The Euler–Lagrange equations are a quiet ancestor of nearly every optimization principle on this page.

Thomas Bayes (1701–1761)

British minister and mathematician. An Essay towards solving a Problem in the Doctrine of Chances, published posthumously in 1763, gave us Bayes’ theorem — the rule for updating beliefs in light of evidence. Probabilistic inference, machine learning, and the platonic optimal agent all trace back to one short paper.
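The rule itself fits in a few lines: multiply each prior by the likelihood of the evidence under that hypothesis, then normalize. A minimal sketch over a discrete set of hypotheses; the function name and the diagnostic-test numbers are made up for illustration.

```python
def bayes_update(prior: dict[str, float], likelihood: dict[str, float]) -> dict[str, float]:
    """Return P(H | data) from P(H) and P(data | H) for each hypothesis H."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    evidence = sum(unnormalized.values())  # P(data), the normalizing constant
    if evidence == 0:
        raise ValueError("the data has zero probability under every hypothesis")
    return {h: v / evidence for h, v in unnormalized.items()}


# Hypothetical test: 1% base rate, 90% sensitivity, 5% false-positive rate.
prior = {"disease": 0.01, "healthy": 0.99}
likelihood_of_positive = {"disease": 0.90, "healthy": 0.05}
posterior = bayes_update(prior, likelihood_of_positive)
```

With these numbers a positive result lifts the probability of disease from 1% to roughly 15%, not 90%: the low base rate still dominates a single piece of evidence.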

Jacob Bernoulli (1655–1705)

Swiss mathematician whose posthumous Ars Conjectandi gave probability its first limit theorem — the law of large numbers, the proof that long-run frequencies converge on the underlying chances. It is the bridge between Bayes’ rule for a single update and the idea that probability is something the world will reveal if you watch it long enough.
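The law of large numbers is easy to watch in simulation. A minimal sketch, assuming a seeded pseudo-random generator stands in for the world; the function name and default seed are mine.

```python
import random


def running_frequency(p: float, n: int, seed: int = 0) -> list[float]:
    """Simulate n Bernoulli(p) trials and return the success frequency after each."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    hits = 0
    frequencies = []
    for i in range(1, n + 1):
        if rng.random() < p:
            hits += 1
        frequencies.append(hits / i)
    return frequencies
```

Early entries in the returned list wander widely; by tens of thousands of trials the frequency sits close to the underlying chance `p`, which is the convergence the theorem guarantees in the limit.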

Gottfried Wilhelm Leibniz (1646–1716)

In the 1670s he built a machine that could multiply and divide, and — more importantly — dreamed of a calculus ratiocinator, a mechanical method for reasoning, alongside the binary arithmetic that every computer now runs on. He is the key conceptual ancestor: the first to imagine reducing reasoning itself to calculation.

Isaac Newton (1643–1727)

Invented calculus (alongside Leibniz), wrote down the laws of motion and universal gravitation, and unified the heavens and the earth under a single mathematics in the Principia. The prototype of the whole enterprise: that the world is law-governed and the laws can be written down. Almost everyone else here is working in the space he opened.

Francis Bacon (1561–1626)

English philosopher and statesman, the father of empiricism and the scientific method. In Novum Organum he championed inductive reasoning — drawing general laws from systematic observation and experiment — against reliance on pure deduction. That turn, from authority and pure logic toward learning from data, is the philosophical seed of nearly everything downstream here: Bernoulli’s frequencies, Bayes’ updating, and inference as a way of knowing the world.

Marcus Aurelius (121–180)

Roman emperor and Stoic. The Meditations — written to himself, never meant for publication — are the clearest reminder I know that the work of being a person is mostly internal, and that most of the obstacles in the way are you. Two millennia later, still useful at 6 AM.

Aristotle (384–322 BCE)

Plato’s student and the first to make logic itself an object of study — the syllogism and the rules of valid inference, which stood essentially unrevised for two thousand years. But the range is the thing: physics, biology, ethics, poetics, politics, metaphysics, each one founded or reshaped, much of it built up from direct observation of the world. The original case that a single mind can take all of knowledge as its subject.

Plato (c. 428–348 BCE)

The Republic, the Theory of Forms, the dialogue as a method of inquiry. Most of Western philosophy is still working in his vocabulary, and the platonic in “platonic ideal” is not metaphorical — it is the conviction that perfect forms exist beyond the noisy instances we encounter, and that thinking is partly the work of remembering them.