1. Learning to Delay in Ride-Sourcing Systems: A Multi-Agent Deep Reinforcement Learning Framework
- Author
Jintao Ke, Hai Yang, Jieping Ye, and Feng Xiao
- Subjects
Matching (statistics), Computer science, business.industry, Q-learning, 02 engineering and technology, Computer Science Applications, Computational Theory and Mathematics, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Key (cryptography), Bipartite graph, Combinatorial optimization, Reinforcement learning, Artificial intelligence, business, Information Systems
- Abstract
Online matching between idle drivers and waiting passengers is one of the key components of a ride-sourcing system. A more effective bipartite matching can be expected if the platform accumulates more idle drivers and waiting passengers in the matching pool. An individual passenger request can also benefit from delayed matching, since the passenger may be matched with a closer idle driver after waiting a few seconds. Motivated by these potential benefits, this paper establishes a two-stage framework that combines combinatorial optimization with multi-agent deep reinforcement learning. The multi-agent reinforcement learning methods dynamically determine the delay time for each passenger request, while the combinatorial optimization conducts an optimal bipartite matching between idle drivers and waiting passengers in the matching pool. Four tailored reinforcement learning methods are developed: delayed multi-agent deep Q-learning (Delayed-M-DQN), delayed multi-agent actor-critic (Delayed-M-A2C), delayed multi-agent Proximal Policy Optimization (Delayed-M-PPO), and delayed multi-agent actor-critic with experience replay (Delayed-M-ACER). Through extensive empirical experiments with a well-designed simulator, we show that the proposed framework can remarkably improve system performance by balancing the trade-off among pick-up time, matching time, and successful matching rate.
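The intuition behind delayed matching can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `optimal_matching`, the one-dimensional road, and the positions are all hypothetical, the matching is solved by brute force rather than a scalable combinatorial solver, and the decision to delay is fixed by hand instead of being learned by a reinforcement learning agent.

```python
from itertools import permutations


def optimal_matching(passengers, drivers):
    """Return (total_distance, pairs) minimizing total pick-up distance.

    passengers, drivers: dicts of id -> position on a 1-D road (toy units).
    Brute force over all injective assignments of the smaller side into the
    larger side, so this is only suitable for tiny matching pools.
    """
    p_ids, d_ids = list(passengers), list(drivers)
    if not p_ids or not d_ids:
        return 0.0, []
    if len(p_ids) <= len(d_ids):
        # Every passenger gets a distinct driver.
        candidates = (list(zip(p_ids, perm))
                      for perm in permutations(d_ids, len(p_ids)))
    else:
        # Every driver gets a distinct passenger; the rest keep waiting.
        candidates = (list(zip(perm, d_ids))
                      for perm in permutations(p_ids, len(d_ids)))
    best_cost, best_pairs = float("inf"), []
    for pairs in candidates:
        cost = sum(abs(passengers[p] - drivers[d]) for p, d in pairs)
        if cost < best_cost:
            best_cost, best_pairs = cost, pairs
    return best_cost, best_pairs


# Hypothetical scenario: one waiting passenger, one idle driver far away now,
# and a second driver who becomes idle nearby a few seconds later.
passengers = {"p1": 0.0}
drivers_now = {"d1": 10.0}
drivers_later = {"d1": 10.0, "d2": 1.0}

immediate_cost, immediate_pairs = optimal_matching(passengers, drivers_now)
delayed_cost, delayed_pairs = optimal_matching(passengers, drivers_later)
print("match immediately:", immediate_cost, immediate_pairs)
print("match after delay:", delayed_cost, delayed_pairs)
```

In this toy case, waiting lets the pool grow from one driver to two, and the optimal matching then pairs the passenger with the closer driver. The framework described in the abstract addresses the harder question this sketch leaves out: how long each request should wait, given that delay also increases matching time.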
- Published
- 2022