The Platonic optimal agent, given a reward/loss specification, pairs Solomonoff induction[1] as the world model with sequential Bayes-optimal expected-reward maximization as the action rule[2]. Both are uncomputable, and even their computable restrictions are generally intractable. The history of modern intelligent agents and robotics is the history of deciding which approximations to make (over hypothesis class, prior, inference scheme, and planning budget) for a given agent's embodiment, environment, and task. From this perspective, progress is no longer a search through disconnected algorithms but a structured navigation of an approximation space.
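To make the approximation axes concrete, here is a minimal sketch (all names illustrative, not from any particular library): the uncomputable Solomonoff mixture is cut down to a small finite hypothesis class with a hand-chosen prior, exact Bayesian inference survives because the class is tiny, and full sequential planning is truncated to a one-step expected-reward maximizer.

```python
import math

# Hypothesis class: each model predicts the reward for a given action.
# Three toy hypotheses stand in for the space of all programs.
models = {
    "always_0": lambda action: 0.0,
    "echo": lambda action: float(action),
    "inverted": lambda action: 1.0 - action,
}

# Uniform prior over hypotheses (stands in for the 2^-length universal prior).
posterior = {name: 1.0 / len(models) for name in models}

def update(posterior, action, observed_reward, noise=0.1):
    """Bayesian update: reweight each model by the likelihood of the data
    under a Gaussian observation-noise assumption."""
    likelihood = {
        name: math.exp(-((m(action) - observed_reward) ** 2) / (2 * noise**2))
        for name, m in models.items()
    }
    z = sum(posterior[n] * likelihood[n] for n in posterior)
    return {n: posterior[n] * likelihood[n] / z for n in posterior}

def best_action(posterior, actions=(0, 1)):
    """One-step Bayes-optimal action: maximize posterior-expected reward
    (a planning budget of horizon 1)."""
    def expected_reward(a):
        return sum(posterior[n] * models[n](a) for n in posterior)
    return max(actions, key=expected_reward)

# Interact with a true environment that behaves like "echo":
# the posterior quickly concentrates on the correct hypothesis.
for _ in range(5):
    a = best_action(posterior)
    r = float(a)  # ground truth: reward equals the action
    posterior = update(posterior, a, r)
```

Every knob here is one of the approximation axes from the paragraph above: swap the dictionary for a neural hypothesis class, the exact update for variational or particle-based inference, and the one-step maximizer for tree search, and you recover much of the modern agent-design space.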

References

[1]: Solomonoff, R.J. (1964). A Formal Theory of Inductive Inference, Parts I & II. Information and Control 7(1–2), 1–22, 224–254.

[2]: Hutter, M. (2005). Universal Artificial Intelligence: Sequential Decisions Based on Algorithmic Probability. Springer.

Further reading