1. Deep reinforcement learning based multi-level dynamic reconfiguration for urban distribution network: a cloud-edge collaboration architecture.
- Author
- Siyuan Jiang, Hongjun Gao, Xiaohui Wang, Junyong Liu, and Kunyu Zuo
- Subjects
- *REINFORCEMENT learning, *DEEP learning, *INTERNET of things, *CLOUD computing, *DISTANCE education
- Abstract
- With the construction of the power Internet of Things (IoT), communication between smart devices in urban distribution networks has been gradually moving towards high speed, high compatibility, and low latency, which provides reliable support for reconfiguration optimization in urban distribution networks. Thus, this study proposed a deep reinforcement learning based multi-level dynamic reconfiguration method for urban distribution networks in a cloud-edge collaboration architecture to obtain a real-time optimal multi-level dynamic reconfiguration solution. First, the multi-level dynamic reconfiguration method was discussed, covering the feeder, transformer, and substation levels. Subsequently, the multi-agent system was combined with the cloud-edge collaboration architecture to build a deep reinforcement learning model for multi-level dynamic reconfiguration in an urban distribution network. The cloud-edge collaboration architecture can effectively support the multi-agent system in operating in a "centralized training and decentralized execution" mode and improve the learning efficiency of the model. Thereafter, for the multi-agent system, this study adopted a combination of offline and online learning to give the model the ability to automatically optimize and update its strategy. In the offline learning phase, a Q-learning-based multi-agent conservative Q-learning (MACQL) algorithm was proposed to stabilize the learning results and reduce the risk of the subsequent online learning phase. In the online learning phase, a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on policy gradients was proposed to explore the action space and update the experience pool. Finally, the effectiveness of the proposed method was verified through a simulation analysis of a real-world 445-node system. [ABSTRACT FROM AUTHOR]
- Published
- 2023