Causal inference multi-agent reinforcement learning for traffic signal control.
- Source :
- Information Fusion. Jun 2023, Vol. 94, p243-256. 14p.
- Publication Year :
- 2023
Abstract
- • A Causal-Inference (CI) model is designed for the non-stationary multi-agent environment.
  • Combined with multi-agent learning, a CI-MA algorithm is proposed for traffic signal control.
  • Traffic information at different granularities is fused for feature representation.
  • A representation loss function and an MA loss function are designed for joint optimization.
  • Experiments show that the CI-MA algorithm outperforms state-of-the-art algorithms.

  A primary challenge in multi-agent reinforcement learning for traffic signal control is to produce effective cooperative traffic-signal policies in non-stationary multi-agent traffic environments. Each agent faces a non-stationary local traffic environment caused by the time-varying traffic-signal policies of adjacent agents. At the same time, every agent also produces its own time-varying traffic-signal policy, which makes the whole traffic environment non-stationary, so the resulting traffic-signal policies may be ineffective. In this work, we propose a Causal Inference Multi-Agent reinforcement learning (CI-MA) algorithm, which alleviates the non-stationarity of multi-agent traffic environments through both feature representation and optimization, and thereby helps to produce effective cooperative traffic-signal policies.

  Specifically, a Causal-Inference (CI) model is first designed to reason about and tackle the non-stationarity of multi-agent traffic environments by both acquiring feature representation distributions and deriving variational lower bounds (i.e., objective functions). Then, based on the CI model, we propose the CI-MA algorithm, in which feature representations are acquired from the non-stationarity of multi-agent traffic environments at both the task level and the timestep level and are used to produce cooperative traffic-signal policies and Q-values for multiple agents. Finally, the corresponding objective functions optimize the whole algorithm from both the causal-inference and the multi-agent reinforcement learning perspectives. Experiments are conducted in different non-stationary multi-agent traffic environments. The results show that the CI-MA algorithm outperforms other state-of-the-art algorithms and demonstrate that the algorithm, trained in synthetic-traffic environments, can be effectively transferred to both synthetic- and real-traffic environments with non-stationarity. [ABSTRACT FROM AUTHOR]
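
The abstract mentions a representation loss derived from a variational lower bound and a multi-agent (MA) loss that are optimized jointly, but it gives no formulas. The following is only a generic sketch of that kind of joint objective, not the authors' formulation: the Gaussian latent with a standard-normal prior, the squared reconstruction error, the one-step temporal-difference loss, and all names (`gaussian_kl`, `representation_loss`, `td_loss`, `joint_loss`, `weight`) are assumptions made for illustration.

```python
import math


def gaussian_kl(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def representation_loss(recon, target, mu, log_var):
    """ELBO-style surrogate (assumed form): squared reconstruction error plus KL term."""
    recon_err = sum((r - t) ** 2 for r, t in zip(recon, target))
    return recon_err + gaussian_kl(mu, log_var)


def td_loss(q_taken, reward, q_next_max, gamma=0.99):
    """Squared one-step temporal-difference error for a single agent and transition."""
    td_target = reward + gamma * q_next_max
    return (q_taken - td_target) ** 2


def joint_loss(rep_loss, ma_loss, weight=1.0):
    """Weighted sum of the two terms; `weight` is a hypothetical trade-off coefficient."""
    return ma_loss + weight * rep_loss


if __name__ == "__main__":
    rep = representation_loss([0.5, 0.5], [0.5, 0.0], mu=[0.0], log_var=[0.0])
    ma = td_loss(q_taken=1.0, reward=1.0, q_next_max=0.0)
    print(joint_loss(rep, ma, weight=0.5))
```

In a real multi-agent setup the TD term would be averaged over agents and a batch of transitions, and both terms would be differentiated with respect to shared network parameters; the scalar version here only shows how the two losses could be combined.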
Details
- Language :
- English
- ISSN :
- 15662535
- Volume :
- 94
- Database :
- Academic Search Index
- Journal :
- Information Fusion
- Publication Type :
- Academic Journal
- Accession number :
- 162028590
- Full Text :
- https://doi.org/10.1016/j.inffus.2023.02.009