Phase 3 Foundations Quiz

Use this quiz after completing Volume 1 and Volume 2 (or the Phase 3 mini-project). If you can answer at least 12 of 15 correctly, you are ready for Phase 4 and Volume 3.

1. RL framework
Q: Name the four main components of an RL system (agent, environment, and two more). What is a state?
Answer: Agent, environment, action, reward. State: a representation of the current situation the agent uses to choose actions.

2. Return
Q: For rewards [0, 0, 1] and \(\gamma = 0.9\), compute the discounted return \(G_0\) from step 0. ...
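The return in question 2 can be checked mechanically: \(G_0 = \sum_t \gamma^t r_t = 0 + 0.9 \cdot 0 + 0.9^2 \cdot 1 = 0.81\). A minimal sketch of that sum (the helper name is ours, not from the quiz):

```python
def discounted_return(rewards, gamma):
    """G_0 = sum over t of gamma**t * rewards[t]."""
    g = 0.0
    for t, r in enumerate(rewards):
        g += (gamma ** t) * r
    return g

# For rewards [0, 0, 1] and gamma = 0.9 this gives 0.9**2 * 1, i.e. about 0.81.
print(discounted_return([0, 0, 1], 0.9))
```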

March 10, 2026 · 5 min · 876 words · codefrydev

Value Functions and Bellman Equation

This page covers the value functions and the Bellman equation you need for the preliminary assessment: the state-value function \(V^\pi(s)\), the action-value function \(Q^\pi(s,a)\), and the Bellman expectation equation for \(V^\pi\).

Why this matters for RL

Value functions are the expected return from a state (or state-action pair) under a policy. They are the main objects we estimate in value-based methods (e.g. TD, Q-learning) and appear in actor-critic methods as the critic. The Bellman equation is the recursive identity that connects the value of one state to the immediate reward and the values of its successor states; it is the basis of dynamic programming and TD learning. ...
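The Bellman expectation equation, \(V^\pi(s) = \sum_a \pi(a|s) \sum_{s'} P(s'|s,a)\,[r + \gamma V^\pi(s')]\), turns directly into iterative policy evaluation. The sketch below runs it on a made-up two-state MDP (the states, transitions, rewards, and uniform policy are illustrative assumptions, not from the article):

```python
gamma = 0.9

# Toy MDP (hypothetical): P[s][a] is a list of (prob, next_state, reward).
# Only the transition 0 --go--> 1 pays a reward.
P = {
    0: {"stay": [(1.0, 0, 0.0)], "go": [(1.0, 1, 1.0)]},
    1: {"stay": [(1.0, 1, 0.0)], "go": [(1.0, 0, 0.0)]},
}
# Uniform random policy pi(a|s) = 0.5 for both actions in both states.
pi = {0: {"stay": 0.5, "go": 0.5}, 1: {"stay": 0.5, "go": 0.5}}

def policy_evaluation(P, pi, gamma, tol=1e-8):
    """Sweep the Bellman expectation backup until V stops changing."""
    V = {s: 0.0 for s in P}
    while True:
        delta = 0.0
        for s in P:
            v = sum(
                pi[s][a] * sum(p * (r + gamma * V[s2]) for p, s2, r in P[s][a])
                for a in P[s]
            )
            delta = max(delta, abs(v - V[s]))
            V[s] = v
        if delta < tol:
            return V

V = policy_evaluation(P, pi, gamma)
print(V)  # state 0 is worth more: it is one step from the only reward
```

Writing out the two fixed-point equations by hand gives \(V(0) = V(1) + 0.5\), which the iteration converges to; that kind of cross-check is a good exercise before the assessment.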

March 10, 2026 · 5 min · 906 words · codefrydev