Chapter 49: Custom Gym Environments (Part 2)
Learning objectives
Create a custom Gym environment: a 2D point mass that must navigate to a goal while avoiding an obstacle.
Define a continuous action (e.g. force in x and y) and a reward function (e.g. distance to goal, penalty for obstacle or boundary).
Test the environment with a SAC (or PPO) agent and verify that the agent can learn to reach the goal.

Concept and real-world RL
Custom environments let you model robot navigation, recommendation (state = user, action = item), or trading (state = market, action = trade). A 2D point mass is a minimal continuous control task: state = (x, y, vx, vy), action = (fx, fy), reward = -distance to goal + penalties. In robot control, similar point-mass or particle models are used for planning and RL; in game AI, custom envs are used for prototyping. Implementing the Gym interface (reset, step, observation_space, action_space) and then training a known algorithm such as SAC on it is a practical check of the design: if a well-tested agent cannot learn to reach the goal, the environment's dynamics or reward are the first place to look. ...
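The point-mass task described above, with state (x, y, vx, vy), action (fx, fy), and reward = -distance to goal + penalties, can be sketched in a few dozen lines. This is a minimal, dependency-free illustration and not a definitive implementation. The class name PointMassEnv and every constant (time step, arena size, goal and obstacle positions, penalty sizes, episode length) are assumptions chosen for the example. It does not subclass gym.Env. To train SAC or PPO on it you would inherit from gym.Env and declare observation_space and action_space as spaces.Box with the same bounds. The return shapes of reset and step follow the newer Gym convention (obs, info) and (obs, reward, terminated, truncated, info), which is also an assumption about the Gym version in use.

```python
import math
import random


class PointMassEnv:
    """2D point mass with one goal and one circular obstacle (illustrative sketch)."""

    def __init__(self, seed=None):
        # All values below are assumptions made for this example.
        self.dt = 0.1                 # integration time step
        self.max_force = 1.0          # fx, fy are clipped to [-max_force, max_force]
        self.bound = 5.0              # arena is the square [-bound, bound] x [-bound, bound]
        self.goal = (4.0, 4.0)
        self.goal_radius = 0.3
        self.obstacle = (2.0, 2.0)
        self.obstacle_radius = 0.7
        self.max_steps = 200
        self.rng = random.Random(seed)
        self.state = None
        self.steps = 0

    def _clip(self, value, limit):
        return max(-limit, min(limit, value))

    def reset(self):
        # Start near the lower-left corner with zero velocity.
        x = self.rng.uniform(-4.5, -3.5)
        y = self.rng.uniform(-4.5, -3.5)
        self.state = (x, y, 0.0, 0.0)
        self.steps = 0
        return self.state, {}

    def step(self, action):
        fx = self._clip(float(action[0]), self.max_force)
        fy = self._clip(float(action[1]), self.max_force)
        x, y, vx, vy = self.state

        # Unit mass: update velocity from force, then position from velocity.
        vx += fx * self.dt
        vy += fy * self.dt
        x += vx * self.dt
        y += vy * self.dt

        # Boundary: clamp the position, stop the mass, and apply a penalty.
        hit_wall = abs(x) > self.bound or abs(y) > self.bound
        if hit_wall:
            x = self._clip(x, self.bound)
            y = self._clip(y, self.bound)
            vx, vy = 0.0, 0.0

        dist = math.hypot(x - self.goal[0], y - self.goal[1])
        in_obstacle = (
            math.hypot(x - self.obstacle[0], y - self.obstacle[1])
            < self.obstacle_radius
        )

        # Reward = -distance to goal + penalties.
        reward = -dist
        if hit_wall:
            reward -= 1.0
        if in_obstacle:
            reward -= 10.0

        self.state = (x, y, vx, vy)
        self.steps += 1
        reached = dist < self.goal_radius
        terminated = reached or in_obstacle
        truncated = (not terminated) and self.steps >= self.max_steps
        info = {"reached_goal": reached, "hit_obstacle": in_obstacle}
        return self.state, reward, terminated, truncated, info


if __name__ == "__main__":
    # Random-action rollout as a smoke test of the reset/step loop.
    env = PointMassEnv(seed=0)
    obs, info = env.reset()
    policy_rng = random.Random(1)
    total = 0.0
    done = False
    while not done:
        action = (policy_rng.uniform(-1.0, 1.0), policy_rng.uniform(-1.0, 1.0))
        obs, reward, terminated, truncated, info = env.step(action)
        total += reward
        done = terminated or truncated
    print("episode return:", total, "steps:", env.steps)
```

A random-action rollout like the one at the bottom is a useful first check before any training: it confirms that episodes end, that rewards stay finite, and that out-of-range actions are clipped. In this sketch the obstacle sits on the straight line between the start region and the goal, so an agent that only minimizes distance will collide; that placement is deliberate, since it forces the policy to trade off the distance term against the obstacle penalty.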