Chapter 77: Generative Adversarial Imitation Learning (GAIL)

Learning objectives
- Implement GAIL: train a discriminator D(s, a) to distinguish state-action pairs drawn from the expert versus the current policy, and use the discriminator output (or log D) as the reward for a policy gradient method.
- Train the policy to maximize the discriminator reward (i.e., to fool the discriminator) while the discriminator learns to tell expert from agent.
- Test on a simple task (e.g., CartPole or a MuJoCo environment) and compare imitation quality with behavioral cloning.
- Explain the connection to GANs: the policy is the generator, and the discriminator supplies the learning signal.
- Relate GAIL to robot navigation and game AI, where we have expert demonstrations and want to match the expert distribution without hand-designed rewards.

Concept and real-world RL ...
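The core GAIL loop can be sketched with plain numpy: a logistic-regression discriminator D(s, a) is trained to output high values on expert state-action features and low values on policy samples, and the policy's reward is -log(1 - D(s, a)), which is large exactly when the policy fools the discriminator. The feature dimensions, data, and learning rate below are illustrative assumptions, not from the chapter; a real implementation would use a neural discriminator and feed the reward into a policy gradient update.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Toy stand-in data (hypothetical): expert and policy (s, a) feature vectors.
expert = rng.normal(loc=1.0, size=(256, 4))
policy = rng.normal(loc=-1.0, size=(256, 4))

# Discriminator D(s, a) = sigmoid(w . x + b), trained with binary cross-entropy
# to output ~1 on expert pairs and ~0 on policy pairs.
w, b, lr = np.zeros(4), 0.0, 0.1
for _ in range(200):
    g_exp = sigmoid(expert @ w + b) - 1.0   # BCE gradient w.r.t. logit, label 1
    g_pol = sigmoid(policy @ w + b)          # BCE gradient w.r.t. logit, label 0
    n = len(expert) + len(policy)
    w -= lr * (expert.T @ g_exp + policy.T @ g_pol) / n
    b -= lr * (g_exp.sum() + g_pol.sum()) / n

# GAIL reward for the policy: r(s, a) = -log(1 - D(s, a)).
d_pol = sigmoid(policy @ w + b)
d_exp = sigmoid(expert @ w + b)
reward_policy = -np.log(1.0 - d_pol + 1e-8)
reward_expert = -np.log(1.0 - d_exp + 1e-8)
# Expert-like pairs earn higher reward, so the policy is pushed toward the
# expert's state-action distribution.
```

In the full algorithm the policy is then updated (e.g., with TRPO or PPO) against this reward, and the discriminator is retrained on fresh policy rollouts, alternating like a GAN.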

March 10, 2026 · 4 min · 704 words · codefrydev

Chapter 78: Adversarial Motion Priors (AMP)

Learning objectives
- Read the AMP paper and explain how it combines a task reward (e.g., velocity tracking, goal reaching) with an adversarial style reward: a discriminator that scores how similar the agent's motion is to reference data.
- Write the combined reward function r = r_task + λ r_style, where r_style comes from a discriminator trained to distinguish agent motion from reference (e.g., motion capture) data.
- Identify why adding a style reward produces more natural-looking and robust locomotion than a task-only reward.
- Relate AMP to robot navigation and game AI (character animation), where we want both task success and natural motion.

Concept and real-world RL ...


March 10, 2026 · 4 min · 717 words · codefrydev