World models

World models are learned internal representations of the outside world, in particular its dynamics, that enable an agent to simulate, plan and generalize.
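To make "simulate and plan" concrete, here is a minimal sketch of planning by imagined rollouts. Everything in it is a hypothetical toy: `dynamics` and `reward` are hand-written stand-ins for what would normally be learned networks, the state is a single integer, and the planner simply enumerates every short action sequence inside the model and keeps the best one.

```python
from itertools import product

ACTIONS = (-1, 0, 1)  # hypothetical discrete action set


def dynamics(state, action):
    """Stand-in for a learned dynamics model: 1-D position plus action."""
    return state + action


def reward(state, goal=5):
    """Stand-in for a learned reward model: negative distance to the goal."""
    return -abs(goal - state)


def plan(state, horizon=3):
    """Roll out every action sequence inside the model (no environment calls)
    and return the first action of the best sequence plus its imagined return."""
    best_return, best_actions = float("-inf"), None
    for actions in product(ACTIONS, repeat=horizon):
        s, total = state, 0
        for a in actions:
            s = dynamics(s, a)   # simulated step
            total += reward(s)   # imagined reward
        if total > best_return:
            best_return, best_actions = total, actions
    return best_actions[0], best_return


first_action, imagined_return = plan(state=0)
```

Exhaustive enumeration only works for tiny action sets and horizons; the methods listed below replace it with tree search or sampling over a learned latent state.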

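Reconstruction-based (the latent state is trained to reconstruct observations): Dreamer, PlaNet
JEPA-style (predictions are made and scored in latent space): TD-MPC, SPR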
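The difference between the two families comes down to where the prediction error is measured. The sketch below is a deliberately simplified illustration, not the loss of any of the named methods: the linear `encode`, `decode` and `predict` functions and all parameter names are assumptions made for the example.

```python
def encode(obs, w):
    """Hypothetical linear encoder: observation vector -> scalar latent."""
    return sum(wi * oi for wi, oi in zip(w, obs))


def decode(z, v):
    """Hypothetical linear decoder: scalar latent -> observation vector."""
    return [vi * z for vi in v]


def predict(z, action, a, b):
    """Hypothetical latent dynamics: next latent from latent and action."""
    return a * z + b * action


def reconstruction_loss(obs, action, next_obs, w, v, a, b):
    """Error measured in observation space (reconstruction-based family)."""
    z_pred = predict(encode(obs, w), action, a, b)
    recon = decode(z_pred, v)
    return sum((r - o) ** 2 for r, o in zip(recon, next_obs)) / len(next_obs)


def latent_prediction_loss(obs, action, next_obs, w, w_target, a, b):
    """Error measured in latent space (joint-embedding predictive family).
    The target encoder weights are treated as constants, i.e. no gradient."""
    z_pred = predict(encode(obs, w), action, a, b)
    z_target = encode(next_obs, w_target)
    return (z_pred - z_target) ** 2


obs, action, next_obs = [1.0, 2.0], 1.0, [2.0, 3.0]
w = [1.0, 0.0]  # the encoder ignores the second observation dimension
rec = reconstruction_loss(obs, action, next_obs, w, v=[1.0, 1.0], a=1.0, b=1.0)
lat = latent_prediction_loss(obs, action, next_obs, w, w_target=w, a=1.0, b=1.0)
```

With these toy numbers the reconstruction loss is 0.5 because the decoder is penalized for the observation dimension the encoder discards, while the latent loss is 0. A latent loss on its own also admits the trivial solution of a constant encoder, which is why methods in this family pair it with a target encoder, reward or value prediction, or similar safeguards.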

The most interesting line of work (near state of the art on Atari 100k):

MuZero runs MCTS over rollouts in a learned latent space but does not include a self-predictive loss.
EfficientZero adds a self-predictive loss on the latent dynamics.
EfficientZero V2 replaces MCTS with a sampling-based Gumbel search.
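As a rough illustration of the sampling idea behind Gumbel-based search, the sketch below shows only the Gumbel-top-k trick: adding independent Gumbel(0, 1) noise to policy logits and keeping the k largest values yields k distinct actions sampled without replacement from the policy. It does not reproduce the search procedure of EfficientZero V2; the function name `gumbel_top_k` and the example logits are assumptions.

```python
import math
import random


def gumbel_top_k(logits, k, rng):
    """Return k distinct action indices, sampled without replacement in
    proportion to softmax(logits), via Gumbel-perturbed logits."""
    perturbed = []
    for index, logit in enumerate(logits):
        u = max(rng.random(), 1e-12)     # guard against log(0)
        g = -math.log(-math.log(u))      # Gumbel(0, 1) sample
        perturbed.append((logit + g, index))
    perturbed.sort(reverse=True)         # largest perturbed logit first
    return [index for _, index in perturbed[:k]]


rng = random.Random(0)
policy_logits = [2.0, 1.0, 0.5, -1.0]    # hypothetical policy-network output
candidates = gumbel_top_k(policy_logits, k=2, rng=rng)
```

A search built on this would spend its simulation budget only on the sampled candidate actions instead of visiting every action at the root, which is what makes the approach attractive when the budget is small.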

Conspicuously missing are approaches that learn action abstractions in the vein of options in hierarchical RL. Some recent work is beginning to address this gap, but overall it remains a significant open research problem.

Links

Predictive Learning
Reinforcement learning
PhD research plan final year
Joint Embedding Predictive Architectures
Deep Learning

Sources

Machine Learning

LeCun (2022) - A Path Towards Autonomous Machine Intelligence (version 0.9.2, 2022-06-27)
Ha and Schmidhuber (2018) - World Models
Hafner et al. (2019) - Learning Latent Dynamics for Planning from Pixels
Hafner et al. (2025) - Mastering diverse control tasks through world models
Hansen, Su and Wang (2024) - TD-MPC2: Scalable, Robust World Models for Continuous Control
Schwarzer et al. (2021) - Data-Efficient Reinforcement Learning with Self-Predictive Representations
Schrittwieser et al. (2020) - Mastering Atari, Go, chess and shogi by planning with a learned model
Ye et al. (2021) - Mastering Atari Games with Limited Data
Kobayashi et al. (2025) - Emergent temporal abstractions in autoregressive models enable hierarchical reinforcement learning

Neuroscience

Levenstein et al. (2024) - Sequential predictive learning is a unifying theory for hippocampal representation and replay