RL
15 problems combining Python, probability, and toy RL. Complete before starting Volume 1.
NumPy for RL: arrays, indexing, broadcasting, random, and batch operations.
PyTorch for RL: tensors, autograd, nn.Module, optimizers, and GPU.
TensorFlow and Keras for RL: models, GradientTape, optimizers, and GPU.
Toy recommender: 100 items and a changing user; the goal is to maximize engagement.
Broken SAC (Soft Actor-Critic): add unit tests, log Q-values, reward, and entropy, then diagnose the failure.
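
As a rough starting point for the toy recommender item above, the sketch below treats it as a 100-armed bandit with a drifting user and applies epsilon-greedy with a constant step size. The course does not specify the environment, so everything here is an assumption made for illustration: the function name `run_toy_recommender`, the click-probability user model, the random-walk drift, and all default parameter values. It uses only the Python standard library.

```python
import random


def run_toy_recommender(n_items=100, steps=5000, epsilon=0.1,
                        step_size=0.1, drift=0.01, seed=0):
    """Epsilon-greedy recommender on a drifting-user bandit (illustrative only)."""
    rng = random.Random(seed)

    # Assumed user model: each item has a click probability in [0, 1].
    click_prob = [rng.random() for _ in range(n_items)]
    estimates = [0.0] * n_items  # running engagement estimate per item
    total_clicks = 0

    for _ in range(steps):
        # Explore with probability epsilon, otherwise recommend the item
        # with the highest current estimate.
        if rng.random() < epsilon:
            item = rng.randrange(n_items)
        else:
            item = max(range(n_items), key=estimates.__getitem__)

        # Engagement is a Bernoulli click drawn from the hidden probability.
        reward = 1 if rng.random() < click_prob[item] else 0
        total_clicks += reward

        # A constant step size weights recent feedback more heavily,
        # which helps when the user keeps changing.
        estimates[item] += step_size * (reward - estimates[item])

        # Assumed drift: every click probability takes a small random step,
        # clipped back into [0, 1].
        click_prob = [min(1.0, max(0.0, p + rng.gauss(0.0, drift)))
                      for p in click_prob]

    return total_clicks / steps, estimates


if __name__ == "__main__":
    avg_engagement, _ = run_toy_recommender()
    print(f"average engagement per step: {avg_engagement:.3f}")
```

Comparing the average engagement across a few values of `epsilon` and `step_size` is one simple way to see how exploration and forgetting interact when the user is not stationary.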