Advantage
Dueling architecture V(s) + A(s,a); compare with DQN.
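As a rough illustration of the V(s) + A(s,a) decomposition, the sketch below combines a state value and per-action advantages into Q-values. It assumes the commonly used mean-subtracted aggregation, Q(s,a) = V(s) + A(s,a) - mean over a' of A(s,a'), which the outline does not spell out. The function name `dueling_q` and the sample numbers are hypothetical. A plain DQN head, by contrast, would output one Q-value per action directly, with no separate value and advantage streams.

```python
from typing import List


def dueling_q(value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a).

    Assumption: mean-subtracted aggregation, so that adding a constant
    to every advantage does not change the resulting Q-values.
    """
    mean_adv = sum(advantages) / len(advantages)
    return [value + a - mean_adv for a in advantages]


# Hypothetical outputs of the two streams for one state with three actions.
v = 2.0
adv = [1.0, 0.0, -1.0]
q = dueling_q(v, adv)

# Shifting every advantage by the same constant leaves Q unchanged.
q_shifted = dueling_q(v, [a + 5.0 for a in adv])

# The greedy action is the one with the largest Q-value.
greedy_action = max(range(len(q)), key=lambda i: q[i])
```

Subtracting the mean is one way to make the split between V and A identifiable; without it, any constant could move between the two streams without changing Q.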
Sketch two-network actor-critic; pseudocode for TD error updates.
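One possible shape for the TD error updates is sketched below. To stay self-contained it swaps the two networks for lookup tables: the critic is a table of state values and the actor is a table of softmax action preferences. The update rules are the usual one-step ones, delta = r + gamma * V(s') - V(s), with the critic moved toward the target and the actor moved along delta times the gradient of log pi(a|s). All names (`actor_critic_step`, `alpha_v`, `alpha_pi`) and the sample transition are hypothetical.

```python
import math
from typing import Dict, List


def softmax(prefs: List[float]) -> List[float]:
    """Numerically stable softmax over action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic_step(
    V: Dict[int, float],
    prefs: Dict[int, List[float]],
    s: int,
    a: int,
    r: float,
    s_next: int,
    done: bool,
    gamma: float = 0.99,
    alpha_v: float = 0.1,
    alpha_pi: float = 0.1,
) -> float:
    """Apply one TD(0) actor-critic update in place and return the TD error."""
    # TD error: bootstrapped target minus the current value estimate.
    target = r + (0.0 if done else gamma * V[s_next])
    delta = target - V[s]

    # Critic update: move V(s) toward the target.
    V[s] += alpha_v * delta

    # Actor update: for a softmax policy, d log pi(a|s) / d pref[b]
    # equals 1[b == a] - pi(b|s).
    probs = softmax(prefs[s])
    for b in range(len(prefs[s])):
        grad_log_pi = (1.0 if b == a else 0.0) - probs[b]
        prefs[s][b] += alpha_pi * delta * grad_log_pi

    return delta


# Hypothetical two-state, two-action problem and a single transition.
V = {0: 0.0, 1: 0.0}
prefs = {0: [0.0, 0.0], 1: [0.0, 0.0]}
delta = actor_critic_step(V, prefs, s=0, a=1, r=1.0, s_next=1, done=False)
```

With function approximators in place of the tables, the same TD error would scale the gradient of the critic's value output and of the actor's log-probability for the chosen action.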