1. Simultaneous task and energy planning using deep reinforcement learning.
- Authors
- Wang, Di, Hu, Mengqi, and Weir, Jeffery D.
- Subjects
- *REINFORCEMENT learning, *ENERGY consumption, *PERCEIVED control (Psychology), *COMBINATORIAL optimization, *HEURISTIC algorithms, *MACHINE learning
- Abstract
- • Three models are developed to study the simultaneous task and energy planning problem for autonomous vehicles.
- • A novel neural optimization algorithm using deep reinforcement learning and a link information filter is proposed.
- • An end-to-end learning framework that maps directly from perceptions to control decisions is proposed.
- • The proposed neural optimizer finds near-optimal solutions far faster than exact and heuristic algorithms.
- To improve the energy awareness of unmanned autonomous vehicles, it is critical to co-optimize task planning and energy scheduling. To the best of our knowledge, most existing task planning algorithms either ignore energy constraints or make energy scheduling decisions based on simple rules. To bridge these research gaps, we propose a combinatorial optimization model for the simultaneous task and energy planning (STEP) problem. In this paper, we propose three variants of the STEP problem: (i) the vehicle can visit stationary charging stations multiple times at various locations; (ii) the vehicle can efficiently coordinate with mobile charging stations to achieve zero waiting time for recharging; and (iii) the vehicle can maximally harvest solar energy by considering time variance in solar irradiance. In addition, to obtain fast and reliable solutions to STEP problems, we propose a neural combinatorial optimizer using a deep reinforcement learning algorithm with a proposed link information filter. Near-optimal solutions can be obtained quickly, without re-solving the problem from scratch, when the environment changes. Our simulation results demonstrate that (i) our proposed neural optimizer can find solutions close to the optimum and outperforms exact and heuristic algorithms in terms of computational cost; and (ii) the end-to-end learning model (mapping directly from perceptions to control) outperforms the traditional learning model (mapping from perception to prediction to control). [ABSTRACT FROM AUTHOR]
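- The abstract does not specify the architecture of the neural combinatorial optimizer. As a rough, hypothetical illustration of the general idea (not the authors' method), the sketch below shows a pointer-network-style greedy decoding loop for a routing problem: node embeddings are scored against the current node and already-visited nodes are masked out. The random projection `W_embed` stands in for parameters that a deep reinforcement learning algorithm (e.g. REINFORCE) would train, and the function name and shapes are assumptions for this toy example.

```python
import numpy as np

def neural_route_decoder(coords, rng=None):
    """Toy pointer-network-style greedy decoder over task locations.

    Node embeddings are scored against the current node's embedding
    (attention-style), visited nodes are masked, and the highest-scoring
    node is visited next. Weights are random: in neural combinatorial
    optimization they would be trained with deep reinforcement learning.
    """
    rng = rng if rng is not None else np.random.default_rng(0)
    n, d = coords.shape[0], 16
    W_embed = rng.normal(size=(coords.shape[1], d))  # untrained projection
    emb = coords @ W_embed                           # node embeddings, (n, d)
    visited = np.zeros(n, dtype=bool)
    current = 0                                      # start at node 0 (depot)
    visited[current] = True
    tour = [current]
    for _ in range(n - 1):
        scores = emb @ emb[current]                  # attention scores, (n,)
        scores[visited] = -np.inf                    # mask visited nodes
        current = int(np.argmax(scores))             # greedy decoding step
        visited[current] = True
        tour.append(current)
    return tour

coords = np.random.default_rng(1).uniform(size=(6, 2))
print(neural_route_decoder(coords))  # a permutation of nodes 0..5
```
- Once such a decoder's weights are trained, new problem instances are solved by a single forward decoding pass, which is why this family of methods avoids re-solving from scratch when the environment changes.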
- Published
- 2022