Preliminary


Functions, lists, loops, and list comprehensions, with RL-relevant examples and explained solutions.
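As a taste of what this module covers, here is a minimal sketch (with made-up episode rewards) of a function, a loop, and a list comprehension applied to an RL-flavored task:

```python
def total_reward(rewards):
    """Sum the rewards collected over one episode (plain loop)."""
    total = 0.0
    for r in rewards:
        total += r
    return total

episode = [1.0, -0.5, 0.0, 2.0]           # toy per-step rewards
print(total_reward(episode))              # 2.5

# List comprehension: keep only the positive rewards
positive = [r for r in episode if r > 0]
print(positive)                           # [1.0, 2.0]
```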

Arrays, indexing, slicing, and element-wise vs matrix operations, with RL-relevant examples and explanations.
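A small sketch of the NumPy operations in question, using a toy Q-table (the values are made up for illustration):

```python
import numpy as np

Q = np.array([[0.0, 1.0],
              [2.0, 3.0],
              [4.0, 5.0]])   # Q-table: 3 states x 2 actions (toy values)

print(Q[1, 0])        # index a single (state, action) entry -> 2.0
print(Q[2])           # slice: all action-values for state 2 -> [4. 5.]
print(Q.max(axis=1))  # greedy value per state -> [1. 3. 5.]

a = np.array([1.0, 2.0])
b = np.array([3.0, 4.0])
print(a * b)   # element-wise product -> [3. 8.]
print(a @ b)   # dot (matrix) product -> 11.0
```

The `*` vs `@` distinction is the single most common source of silent bugs when moving from lists to arrays.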

Sample mean, variance, expectation, and law of large numbers, with bandit-style problems and explained solutions.
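A bandit-style sketch of these ideas, assuming a hypothetical arm whose rewards are normally distributed around a chosen true mean:

```python
import numpy as np

# Law of large numbers: the sample mean of an arm's rewards converges
# to its true expected reward as the number of pulls grows.
rng = np.random.default_rng(0)
true_mean = 1.5   # hypothetical true mean reward of one arm

for n in (10, 1000, 100000):
    rewards = rng.normal(loc=true_mean, scale=1.0, size=n)
    print(n, rewards.mean(), rewards.var())  # mean -> 1.5, variance -> 1.0

# Incremental form used by bandit agents: Q_n = Q_{n-1} + (R_n - Q_{n-1}) / n
Q, n = 0.0, 0
for r in [2.0, 1.0, 1.5, 1.5]:
    n += 1
    Q += (r - Q) / n
print(Q)   # equals the batch sample mean, 1.5
```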

Vectors, dot product, matrix-vector product, and gradients, with RL motivation and explained solutions.
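These operations can be sketched through linear value-function approximation, v(s) ≈ w·x(s); the feature and weight numbers below are made up:

```python
import numpy as np

x = np.array([1.0, 0.5, -2.0])   # feature vector for one state (toy values)
w = np.array([0.2, 0.4, 0.1])    # weight vector

v = w @ x                         # dot product -> estimated state value
print(v)                          # 0.2*1 + 0.4*0.5 + 0.1*(-2) = 0.2

# Matrix-vector product: evaluate several states at once
X = np.array([[1.0, 0.5, -2.0],
              [0.0, 1.0,  1.0]])  # one feature row per state
print(X @ w)                      # [0.2, 0.5]

# Gradient of v = w . x with respect to w is simply the feature vector x
grad = x
```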

Derivatives, chain rule, sigmoid, and softmax, with RL motivation and explained solutions.
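A minimal sketch of the two functions this module covers, including the numerically stable softmax trick (subtracting the max logit before exponentiating):

```python
import numpy as np

def sigmoid(z):
    """sigmoid(z) = 1 / (1 + exp(-z)); its derivative is sigmoid(z) * (1 - sigmoid(z))."""
    return 1.0 / (1.0 + np.exp(-z))

def softmax(z):
    """Numerically stable softmax: subtract the max before exponentiating."""
    e = np.exp(z - np.max(z))
    return e / e.sum()

print(sigmoid(0.0))                        # 0.5
s = sigmoid(0.0)
print(s * (1 - s))                         # derivative at 0 -> 0.25 (a chain-rule building block)
print(softmax(np.array([1.0, 2.0, 3.0])))  # sums to 1; favors the largest logit
```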

Agent, environment, state, action, reward, Markov property, exploration-exploitation, and discount factor, with explanations.
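The discount factor can be made concrete with a few lines of Python computing the discounted return G = r_0 + γ·r_1 + γ²·r_2 + … for a toy reward sequence:

```python
def discounted_return(rewards, gamma=0.9):
    """Discounted return: G = sum over t of gamma^t * r_t."""
    g = 0.0
    for t, r in enumerate(rewards):
        g += (gamma ** t) * r
    return g

print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))  # 1 + 0.5 + 0.25 = 1.75
```

Smaller gamma makes the agent more short-sighted: future rewards shrink geometrically.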

Dynamic programming, Monte Carlo vs TD, on-policy vs off-policy, and Q-learning, with explanations and examples.
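A sketch of the tabular Q-learning update (off-policy TD control) on a few made-up transitions; the state/action sizes and rewards are illustrative only:

```python
import numpy as np

# Q-learning update: Q(s,a) <- Q(s,a) + alpha * (r + gamma * max_a' Q(s',a') - Q(s,a))
n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))
alpha, gamma = 0.5, 0.9

transitions = [(0, 1, 1.0, 1),   # (state, action, reward, next_state) - toy data
               (1, 0, 0.0, 2),
               (2, 1, 5.0, 0)]

for s, a, r, s_next in transitions:
    td_target = r + gamma * Q[s_next].max()   # bootstrap from the greedy next value
    Q[s, a] += alpha * (td_target - Q[s, a])

print(Q)
```

The `max` over next actions is what makes this off-policy: the target ignores whichever action the behavior policy actually took next.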

V^π(s), Q^π(s,a), and the Bellman expectation equation, with worked examples and explanations.
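For a fixed policy π, the Bellman expectation equation V = r + γPV is linear, so a tiny chain can be solved exactly as V = (I − γP)⁻¹ r. A sketch with made-up two-state dynamics:

```python
import numpy as np

gamma = 0.9
P = np.array([[0.5, 0.5],    # transition probabilities under pi (toy numbers)
              [0.0, 1.0]])   # state 1 is absorbing
r = np.array([1.0, 0.0])     # expected immediate reward in each state

# Solve (I - gamma * P) V = r for the state-value function V^pi
V = np.linalg.solve(np.eye(2) - gamma * P, r)
print(V)   # V[1] = 0 (absorbing, zero reward); V[0] = 1 / (1 - 0.45) ~ 1.818
```

Checking by hand: V(0) = 1 + 0.9·(0.5·V(0) + 0.5·0), so V(0)·(1 − 0.45) = 1.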

Why FA, policy gradient update, DQN exploration, experience replay, and actor-critic, with explanations.
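Experience replay is the most self-contained of these ideas to sketch: store transitions in a bounded buffer and sample random minibatches to break the correlation between consecutive steps. The class name and transitions below are illustrative, not from any particular library:

```python
import random
from collections import deque

class ReplayBuffer:
    """Minimal DQN-style replay buffer (a sketch, not a full implementation)."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)  # oldest transitions are evicted

    def push(self, s, a, r, s_next, done):
        self.buffer.append((s, a, r, s_next, done))

    def sample(self, batch_size):
        return random.sample(self.buffer, batch_size)  # uniform random minibatch

    def __len__(self):
        return len(self.buffer)

buf = ReplayBuffer(capacity=100)
for i in range(10):
    buf.push(i, 0, 1.0, i + 1, False)  # made-up transitions
batch = buf.sample(4)
print(len(buf), len(batch))            # 10 4
```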

Tensors, requires_grad, backward, and autograd, with RL-relevant examples and explanations.
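A minimal autograd sketch, assuming PyTorch is installed: `requires_grad=True` tells autograd to track operations on a tensor, and `backward()` applies the chain rule to fill in `.grad`:

```python
import torch

x = torch.tensor(3.0, requires_grad=True)
y = x ** 2 + 2 * x        # y = x^2 + 2x, so dy/dx = 2x + 2
y.backward()              # autograd traverses the graph via the chain rule
print(x.grad)             # tensor(8.) at x = 3
```

This is the same machinery that computes policy-gradient and TD-loss gradients later in the curriculum.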

Reflect on your readiness across math, Python, NumPy, PyTorch, and RL concepts before starting the curriculum.