Bandits: Why don't we just use a library?

Learning objectives: Understand why this curriculum has you implement bandits (and other algorithms) from scratch. Know when it is appropriate to switch to a library in practice. Why implement from scratch? Understanding: Writing the update equations and selection rules yourself forces you to understand how they work. If you only call library.solve(), you may not know what step size, prior, or exploration rule is being used, or how to debug when things go wrong. ...
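To make "update equations and selection rules" concrete, here is a minimal sketch of one common bandit method, epsilon-greedy with sample-average updates, using only the standard library. The function name `run_epsilon_greedy`, its parameters, the Gaussian reward model, and the default values are assumptions chosen for illustration; they are not taken from the curriculum.

```python
import random


def run_epsilon_greedy(true_means, steps=2000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on a Gaussian bandit (illustrative sketch).

    Returns (estimates, counts): the value estimate and pull count per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k  # Q(a): current value estimate for each arm
    counts = [0] * k       # N(a): number of times each arm was pulled

    for _ in range(steps):
        # Selection rule: explore with probability epsilon, else act greedily.
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: estimates[a])

        # Assumed reward model: unit-variance Gaussian around the arm's mean.
        reward = rng.gauss(true_means[arm], 1.0)

        # Update equation: incremental sample average,
        # Q(a) <- Q(a) + (1 / N(a)) * (R - Q(a)).
        counts[arm] += 1
        estimates[arm] += (reward - estimates[arm]) / counts[arm]

    return estimates, counts


if __name__ == "__main__":
    estimates, counts = run_epsilon_greedy([0.2, 0.5, 0.9])
    for arm, (q, n) in enumerate(zip(estimates, counts)):
        print(f"arm {arm}: estimate={q:.3f}, pulls={n}")
```

Every choice a library would otherwise hide is visible here: the step size is `1 / N(a)`, the initial estimates are zero, and exploration is controlled by `epsilon`.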

March 10, 2026 · 2 min · 289 words · codefrydev

Other Libraries

Optional tools you may encounter or use alongside the curriculum: JAX for fast autograd and JIT, Stable-Baselines3 for ready-made algorithms, and Weights & Biases for experiment tracking. No need to master these before starting; refer back when an exercise or chapter mentions them. JAX. What: Autograd and JIT compilation; functional style; used in research (Brax, RLax, many papers). Concepts: jax.grad, jax.jit, jax.vmap, arrays similar to NumPy; GPU/TPU without explicit device code. When: Chapters or papers that use JAX-based envs or algorithms. Docs: jax.readthedocs.io. ...

March 10, 2026 · 4 min · 654 words · codefrydev

How to Install NumPy, SciPy, Matplotlib, Pandas, IPython, Theano, and TensorFlow

Learning objectives: Install NumPy, SciPy, Matplotlib, Pandas, and IPython (or Jupyter) for the curriculum. Optionally install Theano or TensorFlow if you follow exercises that use them; the curriculum primarily uses PyTorch for deep RL. Core libraries (required for early volumes): NumPy (pip install numpy) is used for arrays, random numbers, and numerical operations in bandits, MDPs, and tabular methods. Matplotlib (pip install matplotlib) is used for plotting learning curves, value functions, and heatmaps. ...
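After installing, one quick way to confirm that the libraries named in the learning objectives are importable is a short check like the sketch below. The helper name `missing_packages` and the `CORE` list are illustrative assumptions; the check relies only on the standard-library `importlib.util.find_spec`, which locates a module without importing it.

```python
import importlib.util


def missing_packages(names):
    """Return the top-level module names in `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Import names for the libraries listed in the learning objectives.
# Note that the import name for IPython is capitalized.
CORE = ["numpy", "scipy", "matplotlib", "pandas", "IPython"]

if __name__ == "__main__":
    missing = missing_packages(CORE)
    if missing:
        print("Not installed:", ", ".join(missing))
    else:
        print("All core libraries are importable.")
```

Run it with the same Python interpreter you plan to use for the exercises, since packages installed into a different environment will be reported as missing.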

March 10, 2026 · 2 min · 279 words · codefrydev