Learning objectives
- See the recommended order of topics before (or alongside) RL: math, programming, optional supervised learning.
- Know what this curriculum assumes and where to fill gaps.
Prerequisite roadmap (overview)
Pt 1 — Foundations
- Programming: Variables, types, conditionals, loops, functions, basic data structures (lists, dicts). Language: Python. If you have no programming experience, start with the Learning path Phase 0 and Prerequisites: Python.
- Probability and statistics: Sample mean, variance, expectation, law of large numbers. Used in bandits, Monte Carlo, and value functions. See Math for RL: Probability.
- Linear algebra: Vectors, dot product, matrices, matrix-vector product. Used in value approximation \(V(s) = w^T \phi(s)\) and gradients. See Math for RL: Linear algebra.
- Calculus: Derivatives, chain rule, partial derivatives. Used in policy gradients and loss minimization. See Math for RL: Calculus.
- NumPy (and optionally Pandas, Matplotlib): Arrays, indexing, random numbers, plotting. See Prerequisites: NumPy, Matplotlib, Pandas.
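As a rough self-check for the foundations listed above, the sketch below touches one idea from each math topic: a sample mean converging to an expectation, the linear value estimate \(V(s) = w^T \phi(s)\) as a dot product, and a chain-rule gradient of a squared error. It uses only the Python standard library so it runs without extra installs; in practice NumPy arrays would replace the lists. The weights, features, and target are made-up illustrative numbers, not values from the curriculum.

```python
import random

random.seed(0)  # fixed seed so repeated runs give the same rolls

# Probability: the sample mean of fair die rolls approaches the
# expectation 3.5 as the number of samples grows (law of large numbers).
def sample_mean(n):
    rolls = [random.randint(1, 6) for _ in range(n)]
    return sum(rolls) / n

means = {n: sample_mean(n) for n in (10, 1_000, 100_000)}

# Linear algebra: a linear value estimate V(s) = w^T phi(s) is a dot product.
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

w = [0.5, -1.0, 2.0]     # hypothetical weight vector
phi_s = [1.0, 2.0, 0.5]  # hypothetical feature vector phi(s)
v_s = dot(w, phi_s)      # 0.5 - 2.0 + 1.0 = -0.5

# Calculus: for the squared error L(w) = (target - w^T phi(s))^2, the chain
# rule gives dL/dw_i = -2 * (target - w^T phi(s)) * phi_i(s).
target = 1.0             # hypothetical target value
error = target - v_s     # 1.5
grad = [-2.0 * error * f for f in phi_s]  # [-3.0, -6.0, -1.5]

if __name__ == "__main__":
    for n, m in means.items():
        print(f"n={n:>7}: sample mean = {m:.3f}")
    print("V(s) =", v_s)
    print("gradient =", grad)
```

If each line here reads naturally, the Part 1 material is probably in good shape; if one step is unclear, the matching Math for RL page is the place to start.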
Pt 2 — Toward deep RL
- PyTorch or TensorFlow: Tensors, autograd, simple neural networks (forward pass, backward pass, optimizer step). Needed for Volume 3+ (DQN, policy gradients). See Prerequisites: PyTorch or TensorFlow.
- Gym / Gymnasium: Environments, reset(), step(), observation and reward. See Prerequisites: Gym.
- Supervised learning (optional): Basic idea of loss, gradient descent, and overfitting. Helpful for understanding function approximation and DQN; not strictly required to start RL if you are comfortable with gradients and loss.
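To make the reset() / step() loop concrete, here is a minimal sketch of an environment that follows the Gymnasium calling convention, where reset() returns (observation, info) and step(action) returns (observation, reward, terminated, truncated, info). The CorridorEnv class, its parameters, and the two policies are hypothetical and written for illustration only; they are not part of Gym or Gymnasium, and the code does not import either library.

```python
import random

class CorridorEnv:
    """Hypothetical toy environment (not part of Gymnasium).

    The agent starts in cell 0 of a short corridor; the episode
    terminates when it reaches the last cell.
    """

    def __init__(self, length=5, max_steps=50):
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self, seed=None):
        # Mirrors the Gymnasium convention: returns (observation, info).
        # Assumption: seeding the global random module is enough for this toy.
        if seed is not None:
            random.seed(seed)
        self.position = 0
        self.steps = 0
        return self.position, {}

    def step(self, action):
        # Action 1 moves right, anything else moves left; walls clamp the position.
        move = 1 if action == 1 else -1
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        terminated = self.position == self.length - 1  # reached the goal
        truncated = (not terminated) and self.steps >= self.max_steps  # time limit
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, truncated, {}

def run_episode(env, policy, seed=0):
    """Run one episode and return (total reward, number of steps taken)."""
    obs, info = env.reset(seed=seed)
    total_reward, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        done = terminated or truncated
    return total_reward, env.steps

def always_right(obs):
    return 1

def random_policy(obs):
    return random.choice([0, 1])

if __name__ == "__main__":
    print(run_episode(CorridorEnv(), always_right))
    print(run_episode(CorridorEnv(), random_policy))
```

The same loop shape (reset once, then step until terminated or truncated) carries over to real Gymnasium environments, which is why it is worth being comfortable with it before the deep RL volumes.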
Order of study
- No programming / no math: Phase 0 → Python prerequisite → Math for RL (probability, linear algebra, calculus) → Prerequisites (NumPy, etc.) → Preliminary assessment → Volume 1.
- Programming but weak math: Math for RL → Prerequisites → Preliminary assessment → Volume 1.
- Math and programming, no RL: Preliminary assessment (to see what you know) → Volume 1 → Volume 2 → Volume 3+.
- Some ML, want RL: Volume 1 (quick if you know MDPs) → Volume 2 → Volume 3+.
Use the Course outline and Learning path as the main map; this roadmap shows what to shore up before or in parallel.