Learning Path

Anchor scenarios used throughout the learning path and curriculum to ground RL concepts in practice.

Install Python, run your first script, and learn variables, conditionals, loops, and functions before RL.

Probability, statistics, linear algebra, and calculus with RL-motivated examples. Read in order: 1a through 1d, then the self-check.

Python, NumPy, PyTorch, Gym/Gymnasium, and related tools the curriculum assumes. Complete the tasks on the prerequisites index, then take the Phase 2 quiz.

Deeper pass through the same math areas as Phase 1, with more drills and RL-motivated examples before you start the core RL volumes.

Supervised learning, regression, classification, gradient descent, and evaluation—before neural networks for RL.

Neural networks, backpropagation, CNNs, PyTorch patterns, and a mini-project—directly reusable for DQN, policies, and actor-critic.

Volumes 1–2: MDPs, dynamic programming, Monte Carlo, TD, SARSA, and tabular Q-learning. Core theory before function approximation.

Volumes 3–5: value function approximation, DQN family, policy gradients, actor-critic, and advanced policy optimization (chapters 21–50).

Volumes 6–10: model-based RL, exploration, offline RL, MARL, real-world RL, safety, and RL with LLMs (chapters 51–100).
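
As a small preview of the tabular methods listed under Volumes 1–2 above, here is a minimal sketch of a single tabular Q-learning update in plain Python. The function name, the 3-state/2-action table, and the values of alpha and gamma are illustrative assumptions, not taken from the curriculum itself.

```python
def q_learning_update(Q, state, action, reward, next_state, done,
                      alpha=0.1, gamma=0.99):
    """Apply one tabular Q-learning update in place and return the TD error.

    Q is a list of lists indexed as Q[state][action]. The defaults for
    alpha (step size) and gamma (discount) are illustrative assumptions.
    """
    # Target: the reward alone if the episode ended, otherwise the reward
    # plus the discounted value of the best action in the next state.
    if done:
        target = reward
    else:
        target = reward + gamma * max(Q[next_state])
    td_error = target - Q[state][action]
    # Move the current estimate a fraction alpha toward the target.
    Q[state][action] += alpha * td_error
    return td_error


# Hypothetical toy table: 3 states, 2 actions, all estimates start at zero.
Q = [[0.0, 0.0] for _ in range(3)]

# One observed transition: state 0, action 1, reward 1.0, next state 1.
td = q_learning_update(Q, state=0, action=1, reward=1.0,
                       next_state=1, done=False)
print(td, Q[0][1])
```

Only the entry for the visited state-action pair changes. Every other entry stays as it was until its own pair is visited, which is what makes the method tabular.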

Real-World Scenarios in This Curriculum