MARL
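Model Rock-Paper-Scissors as a Dec-POMDP.
Find the Nash equilibrium of a 2×2 matrix game; describe the outcome of independent learning.
Explain CTDE (centralized training with decentralized execution) with an example; explain why it helps with non-stationarity.
Agents output a message and an action; train them on a coordination task.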
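For the 2×2 Nash equilibrium task, a minimal sketch of the indifference-condition calculation may help as a starting point. The function name `mixed_nash_2x2` and the Matching Pennies payoffs are illustrative assumptions, not part of the task list. The sketch only looks for a mixed equilibrium and does not search for pure ones.

```python
from fractions import Fraction


def mixed_nash_2x2(A, B):
    """Mixed Nash equilibrium of a 2x2 bimatrix game via indifference conditions.

    A[i][j] is the row player's payoff and B[i][j] the column player's payoff
    when row plays i and column plays j (integer payoffs assumed).
    Returns (p, q), where p = P(row plays 0) and q = P(column plays 0),
    or None if the indifference equations have no solution in [0, 1].
    Hypothetical helper for illustration; pure equilibria are not searched.
    """
    # Row mixes so that the column player is indifferent between columns:
    #   p*B00 + (1-p)*B10 = p*B01 + (1-p)*B11
    den_p = B[0][0] - B[1][0] - B[0][1] + B[1][1]
    # Column mixes so that the row player is indifferent between rows:
    #   q*A00 + (1-q)*A01 = q*A10 + (1-q)*A11
    den_q = A[0][0] - A[0][1] - A[1][0] + A[1][1]
    if den_p == 0 or den_q == 0:
        return None
    p = Fraction(B[1][1] - B[1][0], den_p)
    q = Fraction(A[1][1] - A[0][1], den_q)
    if not (0 <= p <= 1 and 0 <= q <= 1):
        return None
    return p, q


# Matching Pennies (assumed example): row wins on a match, column on a mismatch.
A = [[1, -1], [-1, 1]]
B = [[-1, 1], [1, -1]]
print(mixed_nash_2x2(A, B))  # (Fraction(1, 2), Fraction(1, 2))
```

In this example each player randomizes uniformly. Comparing that result with what two independent learners actually converge to (or cycle around) is one way to approach the second half of the task.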