Stabilising experience replay for deep multi-agent reinforcement learning
- Publication Year :
- 2017
- Publisher :
- PMLR, 2017.
Abstract
- Many real-world problems, such as network packet routing and urban traffic control, are naturally modeled as multi-agent reinforcement learning (RL) problems. However, existing multi-agent RL methods typically scale poorly in the problem size. Therefore, a key challenge is to translate the success of deep learning on single-agent RL to the multi-agent setting. A key stumbling block is that independent Q-learning, the most popular multi-agent RL method, introduces nonstationarity that makes it incompatible with the experience replay memory on which deep RL relies. This paper proposes two methods that address this problem: 1) conditioning each agent’s value function on a fingerprint that disambiguates the age of the data sampled from the replay memory and 2) using a multi-agent variant of importance sampling to naturally decay obsolete data. Results on a challenging decentralised variant of StarCraft unit micromanagement confirm that these methods enable the successful combination of experience replay with multi-agent RL.
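- The two ideas in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the class and function names, the choice of (training iteration, exploration rate) as the fingerprint, and the per-agent probability inputs are all assumptions for the sake of the example.

```python
import random

class FingerprintReplayBuffer:
    """Illustrative sketch of idea 1): store each transition with a
    'fingerprint' describing the stage of training at collection time
    (here assumed to be the training iteration and exploration rate),
    so the value network can condition on the age of the sample."""

    def __init__(self, capacity=10000):
        self.capacity = capacity
        self.buffer = []

    def add(self, obs, action, reward, next_obs, iteration, epsilon):
        # Append the fingerprint to the observation; a value function
        # fed this input can disambiguate old data from new data.
        fingerprinted_obs = list(obs) + [iteration, epsilon]
        if len(self.buffer) >= self.capacity:
            self.buffer.pop(0)  # drop the oldest transition
        self.buffer.append((fingerprinted_obs, action, reward, next_obs))

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

def importance_weight(current_probs, behaviour_probs):
    """Illustrative sketch of idea 2): a multi-agent importance ratio,
    the product over the other agents of (current policy probability /
    behaviour policy probability) of the action they took when the
    sample was stored.  Samples whose recorded joint actions are now
    unlikely under the current policies receive small weights, so
    obsolete data naturally decays in the learning update."""
    w = 1.0
    for p_now, p_then in zip(current_probs, behaviour_probs):
        w *= p_now / p_then
    return w
```

- In this sketch, a stale sample whose recorded actions have become half as likely under each of two other agents' current policies would be down-weighted by a factor of four; the buffer itself stays a plain uniform-sampling replay memory.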
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.od......1064..cfb5955041e9409b56b892f561ffddd6