Chapter 78: Adversarial Motion Priors (AMP)
Learning objectives
- Read the AMP paper and explain how it combines a task reward (e.g. velocity tracking, goal reaching) with an adversarial style reward: a discriminator that scores how similar the agent's motion is to reference data.
- Write the combined reward function r = r_task + λ r_style, where r_style comes from a discriminator trained to distinguish agent motion from reference (e.g. motion capture) data.
- Identify why adding a style reward helps produce natural-looking and robust locomotion compared to a task-only reward.
- Relate AMP to robot navigation and game AI (character animation), where we want both task success and natural motion.

Concept and real-world RL ...
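The combined reward r = r_task + λ r_style can be sketched in a few lines. This is a minimal illustration, not the full AMP training loop: it assumes the discriminator is trained with the least-squares GAN objective used in the AMP paper (reference transitions labeled +1, agent transitions labeled -1), so its output d is mapped to a bounded style reward r_style = max(0, 1 - 0.25 (d - 1)²). The function names and the λ value are hypothetical choices for illustration.

```python
def style_reward(d_score: float) -> float:
    """Map a least-squares discriminator output to a bounded style reward.

    d_score: discriminator score for the agent's state transition, trained
    toward +1 on reference (motion-capture) data and -1 on agent data.
    The AMP-style mapping max(0, 1 - 0.25 * (d - 1)^2) yields a reward
    in [0, 1]: near 1 when the motion looks like the reference, near 0
    when the discriminator confidently flags it as non-reference.
    """
    return max(0.0, 1.0 - 0.25 * (d_score - 1.0) ** 2)


def combined_reward(r_task: float, d_score: float, lam: float = 0.5) -> float:
    """Combined objective r = r_task + lambda * r_style.

    lam (λ) trades off task success against motion naturalness; 0.5 is an
    illustrative value, not a recommendation from the paper.
    """
    return r_task + lam * style_reward(d_score)


# A transition the discriminator scores as reference-like (d near +1)
# earns nearly the full style bonus on top of the task reward.
r = combined_reward(r_task=2.0, d_score=0.9, lam=0.5)
```

In practice the discriminator is retrained alongside the policy, so r_style is non-stationary; the bounded mapping keeps the style reward well-scaled relative to the task term as the discriminator improves.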