Deep Reinforcement Learning (module view)
March 24, 2026 · 1 min · 34 words · codefrydev
Biological Inspiration: From Brain Neurons to Artificial Neurons
March 20, 2026 · 5 min · 885 words · codefrydev
What is Machine Learning?
March 20, 2026 · 5 min · 890 words · codefrydev
Datasets and Features
March 20, 2026 · 5 min · 885 words · codefrydev
Statistics for RL
March 20, 2026 · 10 min · 1928 words · codefrydev
The Perceptron: Learning from Mistakes
March 20, 2026 · 5 min · 920 words · codefrydev
Activation Functions: Adding Non-Linearity
March 20, 2026 · 5 min · 873 words · codefrydev
Linear Regression
March 20, 2026 · 5 min · 878 words · codefrydev
Checkpoint: ML Foundations Mid-Point
March 20, 2026 · 2 min · 396 words · codefrydev
Gradient Descent
March 20, 2026 · 5 min · 890 words · codefrydev
Multi-Layer Perceptrons: Stacking Layers to Break Linearity
March 20, 2026 · 4 min · 825 words · codefrydev
Forward Propagation: Computing the Network Output
March 20, 2026 · 5 min · 869 words · codefrydev
Multiple Regression
March 20, 2026 · 5 min · 875 words · codefrydev
Phase 4 Assessment: Machine Learning Foundations
March 20, 2026 · 6 min · 1167 words · codefrydev
Checkpoint: DL Foundations Mid-Point
March 20, 2026 · 3 min · 460 words · codefrydev
Classification Concepts
March 20, 2026 · 4 min · 785 words · codefrydev
Loss Functions: Measuring How Wrong the Network Is
March 20, 2026 · 4 min · 773 words · codefrydev
Backpropagation: Teaching Networks by Propagating Errors
March 20, 2026 · 5 min · 930 words · codefrydev
Logistic Regression
March 20, 2026 · 5 min · 866 words · codefrydev
Phase 5 Assessment: Deep Learning Foundations
March 20, 2026 · 7 min · 1326 words · codefrydev
Model Evaluation
March 20, 2026 · 4 min · 745 words · codefrydev
Optimizers: SGD, Momentum, and Adam
March 20, 2026 · 4 min · 754 words · codefrydev
Cross-Validation and Overfitting
March 20, 2026 · 4 min · 752 words · codefrydev
The Training Loop
March 20, 2026 · 3 min · 624 words · codefrydev
K-Nearest Neighbors
March 20, 2026 · 3 min · 625 words · codefrydev
Regularization and Overfitting
March 20, 2026 · 4 min · 659 words · codefrydev
CNN Basics: Convolutions and Pooling
March 20, 2026 · 4 min · 721 words · codefrydev
Decision Trees
March 20, 2026 · 4 min · 741 words · codefrydev
K-Means Clustering
March 20, 2026 · 4 min · 647 words · codefrydev
PyTorch: Building Neural Networks with nn.Module
March 20, 2026 · 5 min · 988 words · codefrydev
DL Mini-Project: Digits Classifier in NumPy
March 20, 2026 · 3 min · 549 words · codefrydev
Scikit-Learn Workflow
March 20, 2026 · 3 min · 587 words · codefrydev
ML Mini-Project: Wine Classification
March 20, 2026 · 3 min · 628 words · codefrydev
DL Foundations Drills
March 20, 2026 · 5 min · 1023 words · codefrydev
ML Foundations Drills
March 20, 2026 · 5 min · 1043 words · codefrydev
DL Foundations Review & Bridge to RL
March 20, 2026 · 4 min · 819 words · codefrydev
ML Foundations Review & Bridge to Deep Learning
March 20, 2026 · 4 min · 785 words · codefrydev
Phase 0 Assessment: Python Basics
March 19, 2026 · 3 min · 564 words · codefrydev
Python Confidence Builder
March 19, 2026 · 13 min · 2600 words · codefrydev
RL in Plain English
March 19, 2026 · 10 min · 2019 words · codefrydev
Bridge Exercises: Python + Math + RL
March 19, 2026 · 10 min · 1983 words · codefrydev
Checkpoint: Volume 1, Midpoint (After Chapter 5)
March 19, 2026 · 2 min · 304 words · codefrydev
Checkpoint: Volume 2, Midpoint (After Chapter 15)
March 19, 2026 · 3 min · 479 words · codefrydev
How to Debug RL Code
March 19, 2026 · 7 min · 1308 words · codefrydev
Checkpoint: Volume 3, Midpoint (After Chapter 25)
March 19, 2026 · 3 min · 590 words · codefrydev
How to Read RL Papers
March 19, 2026 · 4 min · 851 words · codefrydev
Checkpoint: Volume 4, Midpoint (After Chapter 35)
March 19, 2026 · 3 min · 570 words · codefrydev
Checkpoint: Volume 5, Midpoint (After Chapter 45)
March 19, 2026 · 4 min · 648 words · codefrydev
Phase 8 Assessment: Advanced RL
March 19, 2026 · 6 min · 1195 words · codefrydev
Reinforcement learning glossary — terms, definitions, and chapter links
March 19, 2026 · 13 min · 2585 words · codefrydev
Volume 1 Drills — Mathematical Foundations
March 19, 2026 · 6 min · 1121 words · codefrydev
Volume 2 Drills — Tabular Model-Free Methods
March 19, 2026 · 7 min · 1404 words · codefrydev
Volume 3 Drills — Function Approximation & DQN
March 19, 2026 · 8 min · 1595 words · codefrydev
Volume 1 Review & Bridge to Volume 2
March 19, 2026 · 3 min · 608 words · codefrydev
Volume 2 Review & Bridge to Volume 3
March 19, 2026 · 4 min · 663 words · codefrydev
Volume 3 Review & Bridge to Volume 4
March 19, 2026 · 2 min · 350 words · codefrydev
Volume 4 Review & Bridge to Volume 5
March 19, 2026 · 3 min · 467 words · codefrydev
Volume 5 Review & Bridge to Volume 6
March 19, 2026 · 3 min · 489 words · codefrydev
Volume 6 Review & Bridge to Volume 7
March 19, 2026 · 3 min · 498 words · codefrydev
Volume 7 Review & Bridge to Volume 8
March 19, 2026 · 3 min · 569 words · codefrydev
Volume 8 Review & Bridge to Volume 9
March 19, 2026 · 3 min · 538 words · codefrydev
Volume 9 Review & Bridge to Volume 10
March 19, 2026 · 3 min · 532 words · codefrydev
Chapter 1: The Reinforcement Learning Framework
March 10, 2026 · 5 min · 902 words · codefrydev
Course Outline
March 10, 2026 · 6 min · 1175 words · codefrydev
Is This for Beginners or Experts? Academic or Practical? Fast or Slow-Paced?
March 10, 2026 · 2 min · 378 words · codefrydev
Probability & Statistics
March 10, 2026 · 13 min · 2558 words · codefrydev
Python basics for RL and the preliminary assessment
March 10, 2026 · 5 min · 853 words · codefrydev
Chapter 2: Multi-Armed Bandits
March 10, 2026 · 4 min · 807 words · codefrydev
How to Succeed in this Course (Long Version)
March 10, 2026 · 2 min · 406 words · codefrydev
NumPy
March 10, 2026 · 4 min · 793 words · codefrydev
Phase 1 Self-Check: Math for RL
March 10, 2026 · 5 min · 858 words · codefrydev
Bandits: Optimistic Initial Values
March 10, 2026 · 2 min · 305 words · codefrydev
Chapter 3: Markov Decision Processes (MDPs)
March 10, 2026 · 4 min · 781 words · codefrydev
Effective Learning Strategies for Machine Learning
March 10, 2026 · 2 min · 292 words · codefrydev
Linear Algebra
March 10, 2026 · 13 min · 2571 words · codefrydev
Phase 2 Readiness Quiz
March 10, 2026 · 4 min · 656 words · codefrydev
Probability & Statistics
March 10, 2026 · 5 min · 1062 words · codefrydev
Bandits: UCB1
March 10, 2026 · 2 min · 319 words · codefrydev
Calculus
March 10, 2026 · 11 min · 2332 words · codefrydev
Chapter 4: The Reward Hypothesis
March 10, 2026 · 4 min · 806 words · codefrydev
Gridworld
March 10, 2026 · 2 min · 356 words · codefrydev
Linear Algebra
March 10, 2026 · 5 min · 922 words · codefrydev
Machine Learning and AI Prerequisite Roadmap (pt 1–2)
March 10, 2026 · 2 min · 320 words · codefrydev
Anaconda Environment Setup
March 10, 2026 · 2 min · 237 words · codefrydev
Bandits: Thompson Sampling
March 10, 2026 · 2 min · 401 words · codefrydev
Calculus
March 10, 2026 · 4 min · 793 words · codefrydev
Chapter 5: Value Functions
March 10, 2026 · 4 min · 724 words · codefrydev
Choosing Rewards
March 10, 2026 · 2 min · 354 words · codefrydev
Bandits: Nonstationary
March 10, 2026 · 2 min · 363 words · codefrydev
Chapter 6: The Bellman Equations
March 10, 2026 · 4 min · 688 words · codefrydev
RL Framework
March 10, 2026 · 6 min · 1198 words · codefrydev
Setting Up Your Environment
March 10, 2026 · 2 min · 229 words · codefrydev
Bandits: Why don’t we just use a library?
March 10, 2026 · 2 min · 289 words · codefrydev
Chapter 7: Dynamic Programming — Policy Evaluation
March 10, 2026 · 4 min · 811 words · codefrydev
How to Install NumPy, SciPy, Matplotlib, Pandas, IPython, Theano, and TensorFlow
March 10, 2026 · 2 min · 279 words · codefrydev
Tabular Methods
March 10, 2026 · 6 min · 1277 words · codefrydev
Chapter 8: Dynamic Programming — Policy Iteration
March 10, 2026 · 4 min · 762 words · codefrydev
How to Code by Yourself (part 1)
March 10, 2026 · 2 min · 312 words · codefrydev
Value Functions and Bellman Equation
March 10, 2026 · 5 min · 906 words · codefrydev
Windy Gridworld
March 10, 2026 · 2 min · 392 words · codefrydev
Chapter 9: Dynamic Programming — Value Iteration
March 10, 2026 · 4 min · 733 words · codefrydev
Dynamic Programming: Gridworld in Code
March 10, 2026 · 2 min · 390 words · codefrydev
Function Approximation and Deep RL
March 10, 2026 · 7 min · 1400 words · codefrydev
How to Code by Yourself (part 2)
March 10, 2026 · 2 min · 346 words · codefrydev
Chapter 10: Limitations of Dynamic Programming
March 10, 2026 · 4 min · 826 words · codefrydev
Phase 6 Assessment: RL Foundations
March 10, 2026 · 5 min · 876 words · codefrydev
Python
March 10, 2026 · 9 min · 1810 words · codefrydev
PyTorch Basics
March 10, 2026 · 5 min · 926 words · codefrydev
Chapter 11: Monte Carlo Methods
March 10, 2026 · 5 min · 895 words · codefrydev
Final Self-Assessment
March 10, 2026 · 3 min · 448 words · codefrydev
Chapter 12: Temporal Difference (TD) Learning
March 10, 2026 · 4 min · 744 words · codefrydev
Monte Carlo in Code
March 10, 2026 · 3 min · 464 words · codefrydev
Chapter 13: SARSA (On-Policy TD Control)
March 10, 2026 · 3 min · 639 words · codefrydev
Phase 7 Assessment: Deep RL
March 10, 2026 · 4 min · 814 words · codefrydev
TD, SARSA, and Q-Learning in Code
March 10, 2026 · 2 min · 351 words · codefrydev
Chapter 14: Q-Learning (Off-Policy TD Control)
March 10, 2026 · 4 min · 700 words · codefrydev
Chapter 15: Expected SARSA
March 10, 2026 · 4 min · 708 words · codefrydev
Chapter 16: N-Step Bootstrapping
March 10, 2026 · 4 min · 653 words · codefrydev
Chapter 17: Planning and Learning with Tabular Methods
March 10, 2026 · 4 min · 686 words · codefrydev
Chapter 18: Custom Gym Environments (Part 1)
March 10, 2026 · 4 min · 663 words · codefrydev
Chapter 19: Hyperparameter Tuning in Tabular RL
March 10, 2026 · 4 min · 713 words · codefrydev
Chapter 20: The Limits of Tabular Methods
March 10, 2026 · 4 min · 761 words · codefrydev
NumPy
March 10, 2026 · 7 min · 1399 words · codefrydev
Chapter 21: Linear Function Approximation
March 10, 2026 · 4 min · 713 words · codefrydev
Feature Engineering for Reinforcement Learning
March 10, 2026 · 2 min · 400 words · codefrydev
CartPole
March 10, 2026 · 3 min · 451 words · codefrydev
Chapter 22: Artificial Neural Networks for RL
March 10, 2026 · 4 min · 655 words · codefrydev
Chapter 23: Deep Q-Networks (DQN)
March 10, 2026 · 4 min · 652 words · codefrydev
Chapter 24: Experience Replay
March 10, 2026 · 4 min · 715 words · codefrydev
Chapter 25: Target Networks
March 10, 2026 · 4 min · 699 words · codefrydev
Chapter 26: Double DQN (DDQN)
March 10, 2026 · 3 min · 630 words · codefrydev
Chapter 27: Dueling DQN
March 10, 2026 · 4 min · 693 words · codefrydev
Chapter 28: Prioritized Experience Replay (PER)
March 10, 2026 · 4 min · 747 words · codefrydev
Chapter 29: Noisy Networks for Exploration
March 10, 2026 · 4 min · 760 words · codefrydev
Chapter 30: Rainbow DQN
March 10, 2026 · 4 min · 693 words · codefrydev
Pandas
March 10, 2026 · 4 min · 764 words · codefrydev
Chapter 31: Introduction to Policy-Based Methods
March 10, 2026 · 4 min · 678 words · codefrydev
Chapter 32: The Policy Objective Function
March 10, 2026 · 4 min · 713 words · codefrydev
Chapter 33: The REINFORCE Algorithm
March 10, 2026 · 4 min · 720 words · codefrydev
Chapter 34: Reducing Variance in Policy Gradients
March 10, 2026 · 4 min · 715 words · codefrydev
Chapter 35: Actor-Critic Architectures
March 10, 2026 · 4 min · 690 words · codefrydev
Visualization & Plotting for RL
March 10, 2026 · 6 min · 1180 words · codefrydev
Chapter 36: Advantage Actor-Critic (A2C)
March 10, 2026 · 4 min · 689 words · codefrydev
Chapter 37: Asynchronous Advantage Actor-Critic (A3C)
March 10, 2026 · 4 min · 673 words · codefrydev
Chapter 38: Continuous Action Spaces
March 10, 2026 · 4 min · 689 words · codefrydev
Chapter 39: Deep Deterministic Policy Gradient (DDPG)
March 10, 2026 · 4 min · 643 words · codefrydev
Chapter 40: Twin Delayed DDPG (TD3)
March 10, 2026 · 4 min · 668 words · codefrydev
Matplotlib
March 10, 2026 · 5 min · 969 words · codefrydev
Chapter 41: The Problem with Standard Policy Gradients
March 10, 2026 · 4 min · 738 words · codefrydev
Chapter 42: Trust Region Policy Optimization (TRPO)
March 10, 2026 · 4 min · 682 words · codefrydev
Chapter 43: Proximal Policy Optimization (PPO): Intuition
March 10, 2026 · 4 min · 666 words · codefrydev
Chapter 44: PPO: Implementation Details
March 10, 2026 · 4 min · 649 words · codefrydev
Chapter 45: Coding PPO from Scratch
March 10, 2026 · 4 min · 656 words · codefrydev
Chapter 46: Maximum Entropy RL
March 10, 2026 · 4 min · 660 words · codefrydev
Chapter 47: Soft Actor-Critic (SAC)
March 10, 2026 · 3 min · 631 words · codefrydev
Chapter 48: SAC vs. PPO
March 10, 2026 · 3 min · 619 words · codefrydev
Chapter 49: Custom Gym Environments (Part 2)
March 10, 2026 · 4 min · 649 words · codefrydev
Chapter 50: Advanced Hyperparameter Tuning
March 10, 2026 · 3 min · 604 words · codefrydev
PyTorch
March 10, 2026 · 5 min · 1052 words · codefrydev
Chapter 51: Model-Free vs. Model-Based RL
March 10, 2026 · 3 min · 552 words · codefrydev
Chapter 52: Learning World Models
March 10, 2026 · 3 min · 542 words · codefrydev
Chapter 53: Planning with Known Models
March 10, 2026 · 3 min · 550 words · codefrydev
Chapter 54: Monte Carlo Tree Search (MCTS)
March 10, 2026 · 3 min · 559 words · codefrydev
Chapter 55: AlphaZero Architecture
March 10, 2026 · 3 min · 563 words · codefrydev
Chapter 56: MuZero Intuition
March 10, 2026 · 3 min · 572 words · codefrydev
Chapter 57: Dreamer and Latent Imagination
March 10, 2026 · 3 min · 571 words · codefrydev
Chapter 58: Model-Based Policy Optimization (MBPO)
March 10, 2026 · 3 min · 584 words · codefrydev
Chapter 59: Probabilistic Ensembles with Trajectory Sampling (PETS)
March 10, 2026 · 3 min · 593 words · codefrydev
Chapter 60: Visualizing Model-Based Rollouts
March 10, 2026 · 3 min · 584 words · codefrydev
TensorFlow
March 10, 2026 · 5 min · 1051 words · codefrydev
Chapter 61: The Hard Exploration Problem
March 10, 2026 · 3 min · 590 words · codefrydev
Chapter 62: Intrinsic Motivation
March 10, 2026 · 3 min · 588 words · codefrydev
Chapter 63: Curiosity-Driven Exploration (ICM)
March 10, 2026 · 4 min · 744 words · codefrydev
Chapter 64: Random Network Distillation (RND)
March 10, 2026 · 4 min · 746 words · codefrydev
Chapter 65: Count-Based Exploration
March 10, 2026 · 4 min · 757 words · codefrydev
Chapter 66: Go-Explore Algorithm
March 10, 2026 · 5 min · 877 words · codefrydev
Chapter 67: Meta-Learning (Learning to Learn)
March 10, 2026 · 4 min · 833 words · codefrydev
Chapter 68: Model-Agnostic Meta-Learning (MAML) in RL
March 10, 2026 · 4 min · 751 words · codefrydev
Chapter 69: RL² (Reinforcement Learning as an RNN)
March 10, 2026 · 4 min · 833 words · codefrydev
Chapter 70: Unsupervised Environment Design
March 10, 2026 · 5 min · 861 words · codefrydev
OpenAI Gym / Gymnasium
March 10, 2026 · 6 min · 1082 words · codefrydev
Chapter 71: The Offline RL Problem
March 10, 2026 · 4 min · 845 words · codefrydev
Chapter 72: Conservative Q-Learning (CQL)
March 10, 2026 · 4 min · 804 words · codefrydev
Chapter 73: Decision Transformers
March 10, 2026 · 4 min · 843 words · codefrydev
Chapter 74: Introduction to Imitation Learning
March 10, 2026 · 4 min · 742 words · codefrydev
Chapter 75: Limitations of Behavioral Cloning
March 10, 2026 · 5 min · 941 words · codefrydev
Chapter 76: Inverse Reinforcement Learning (IRL)
March 10, 2026 · 5 min · 882 words · codefrydev
Chapter 77: Generative Adversarial Imitation Learning (GAIL)
March 10, 2026 · 4 min · 828 words · codefrydev
Chapter 78: Adversarial Motion Priors (AMP)
March 10, 2026 · 4 min · 850 words · codefrydev
Chapter 79: Offline-to-Online Finetuning
March 10, 2026 · 5 min · 881 words · codefrydev
Chapter 80: RL from Human Feedback (RLHF) Basics
March 10, 2026 · 4 min · 851 words · codefrydev
Other Libraries
March 10, 2026 · 4 min · 803 words · codefrydev
Chapter 81: Multi-Agent Fundamentals
March 10, 2026 · 4 min · 796 words · codefrydev
Chapter 82: Game Theory Basics for RL
March 10, 2026 · 4 min · 805 words · codefrydev
Chapter 83: Independent Q-Learning (IQL)
March 10, 2026 · 4 min · 851 words · codefrydev
Chapter 84: Centralized Training, Decentralized Execution (CTDE)
March 10, 2026 · 5 min · 891 words · codefrydev
Chapter 85: Multi-Agent DDPG (MADDPG)
March 10, 2026 · 4 min · 792 words · codefrydev
Chapter 86: Value Decomposition Networks (VDN)
March 10, 2026 · 4 min · 812 words · codefrydev
Chapter 87: QMIX Algorithm
March 10, 2026 · 4 min · 804 words · codefrydev
Chapter 88: Multi-Agent PPO (MAPPO)
March 10, 2026 · 4 min · 811 words · codefrydev
Chapter 89: Self-Play and League Training
March 10, 2026 · 5 min · 887 words · codefrydev
Chapter 90: Communication in MARL
March 10, 2026 · 5 min · 873 words · codefrydev
Chapter 91: RL in Robotics
March 10, 2026 · 4 min · 792 words · codefrydev
Chapter 92: Safe Reinforcement Learning
March 10, 2026 · 4 min · 849 words · codefrydev
Chapter 93: RL for Algorithmic Trading
March 10, 2026 · 4 min · 775 words · codefrydev
Chapter 94: RL in Recommender Systems
March 10, 2026 · 4 min · 826 words · codefrydev
Chapter 95: Training Large Language Models with PPO
March 10, 2026 · 5 min · 878 words · codefrydev
Chapter 96: Implementing RLHF in NLP
March 10, 2026 · 4 min · 851 words · codefrydev
Chapter 97: Direct Preference Optimization (DPO)
March 10, 2026 · 4 min · 823 words · codefrydev
Chapter 98: Evaluating RL Agents
March 10, 2026 · 5 min · 854 words · codefrydev
Chapter 99: Debugging RL Code
March 10, 2026 · 5 min · 865 words · codefrydev
Chapter 100: The Future of Reinforcement Learning
March 10, 2026 · 5 min · 874 words · codefrydev
How to Succeed in this Course
March 10, 2026 · 1 min · 208 words · codefrydev
Real-World Scenarios in This Curriculum
March 10, 2026 · 3 min · 563 words · codefrydev
Stock Trading Project with Reinforcement Learning
March 10, 2026 · 4 min · 717 words · codefrydev
This Course vs. RL Book: What’s the Difference?
March 10, 2026 · 2 min · 405 words · codefrydev
Where to Get the Code
March 10, 2026 · 2 min · 240 words · codefrydev
Worked Solutions Index
March 10, 2026 · 2 min · 285 words · codefrydev