Search Results (1,442 results)
2. Fair Deep Reinforcement Learning with Generalized Gini Welfare Functions
- Author
Yu, Guanbao, Siddique, Umer, Weng, Paul, Amigoni, Francesco, editor, and Sinha, Arunesh, editor
- Published
- 2024
- Full Text
- View/download PDF
3. Sequence-Based Deep Reinforcement Learning for Task Offloading in Mobile Edge Computing: A Comparison Study
- Author
Xiao, Xiang-Jie, Wang, Yong, Wang, Kezhi, Huang, Pei-Qiu, Pan, Linqiang, editor, Wang, Yong, editor, and Lin, Jianqing, editor
- Published
- 2024
- Full Text
- View/download PDF
4. Ökolopoly: Case Study on Large Action Spaces in Reinforcement Learning
- Author
Engelhardt, Raphael C., Raycheva, Ralitsa, Lange, Moritz, Wiskott, Laurenz, Konen, Wolfgang, Nicosia, Giuseppe, editor, Ojha, Varun, editor, La Malfa, Emanuele, editor, La Malfa, Gabriele, editor, Pardalos, Panos M., editor, and Umeton, Renato, editor
- Published
- 2024
- Full Text
- View/download PDF
5. Deep Reinforcement Learning Based Intelligent Resource Allocation Techniques with Applications to Cloud Computing
- Author
Kaur, Ramanpreet, Anand, Divya, Kaur, Upinder, Kaur, Jaskiran, Verma, Sahil, Kavita, Santosh, KC, editor, Makkar, Aaisha, editor, Conway, Myra, editor, Singh, Ashutosh K., editor, Vacavant, Antoine, editor, Abou el Kalam, Anas, editor, Bouguelia, Mohamed-Rafik, editor, and Hegadi, Ravindra, editor
- Published
- 2024
- Full Text
- View/download PDF
6. Arterial Traffic Optimization Algorithm Based on Deep Reinforcement Learning and Green Wave Coordination Control in Complex Lane Queuing Conditions
- Author
Wang, Tong, Liu, Songming, Chen, Liwei, Ouyang, Min, Gao, Shan, Zhang, Yingxue, Sun, Yuqing, editor, Lu, Tun, editor, Wang, Tong, editor, Fan, Hongfei, editor, Liu, Dongning, editor, and Du, Bowen, editor
- Published
- 2024
- Full Text
- View/download PDF
7. Explicit Coordination Based Multi-agent Reinforcement Learning for Intelligent Traffic Signal Control
- Author
Li, Yixuan, Che, Qian, Zhou, Yifeng, Wang, Wanyuan, Jiang, Yichuan, Sun, Yuqing, editor, Lu, Tun, editor, Wang, Tong, editor, Fan, Hongfei, editor, Liu, Dongning, editor, and Du, Bowen, editor
- Published
- 2024
- Full Text
- View/download PDF
8. End-to-End Automatic Parking Based on Proximal Policy Optimization Algorithm in Carla
- Author
Li, Zhizhao, Jiao, Longyin, Fu, Zhumu, Tao, Fazhan, Sun, Fuchun, editor, Meng, Qinghu, editor, Fu, Zhumu, editor, and Fang, Bin, editor
- Published
- 2024
- Full Text
- View/download PDF
9. Lane Change Decision Control of Autonomous Vehicle Based on A3C Algorithm
- Author
Zhou, Chuntao, Liao, Mingrui, Jiao, Longyin, Tao, Fazhan, Sun, Fuchun, editor, Meng, Qinghu, editor, Fu, Zhumu, editor, and Fang, Bin, editor
- Published
- 2024
- Full Text
- View/download PDF
10. Reinforcement Learning in Algorithmic Trading: An Overview
- Author
Czuba, Przemysław, and Soliman, Khalid S., editor
- Published
- 2024
- Full Text
- View/download PDF
11. Deep Reinforcement Learning for Multiobjective Scheduling in Industry 5.0 Reconfigurable Manufacturing Systems
- Author
Bezoui, Madani, Kermali, Abdelfatah, Bounceur, Ahcene, Qaisar, Saeed Mian, Almaktoom, Abdulaziz Turki, Renault, Éric, editor, Boumerdassi, Selma, editor, and Mühlethaler, Paul, editor
- Published
- 2024
- Full Text
- View/download PDF
12. Task Offloading with Dual-Mode Switching in Multi-access Edge Computing
- Author
Zhang, Xiaoliang, Duan, Jiaqi, Yan, Mei, Lyu, Shunming, Jin, Hai, editor, Pan, Yi, editor, and Lu, Jianfeng, editor
- Published
- 2024
- Full Text
- View/download PDF
13. Study on LSTM and ConvLSTM Memory-Based Deep Reinforcement Learning
- Author
Duarte, Fernando Fradique, Lau, Nuno, Pereira, Artur, Reis, Luís Paulo, Rocha, Ana Paula, editor, Steels, Luc, editor, and van den Herik, Jaap, editor
- Published
- 2024
- Full Text
- View/download PDF
14. ContainerGym: A Real-World Reinforcement Learning Benchmark for Resource Allocation
- Author
Pendyala, Abhijeet, Dettmer, Justin, Glasmachers, Tobias, Atamna, Asma, Nicosia, Giuseppe, editor, Ojha, Varun, editor, La Malfa, Emanuele, editor, La Malfa, Gabriele, editor, Pardalos, Panos M., editor, and Umeton, Renato, editor
- Published
- 2024
- Full Text
- View/download PDF
15. Deep Reinforcement Learning for Delay and Energy-Aware Task Scheduling in Edge Clouds
- Author
Xun, Meng, Yao, Yan, Yu, Jiguo, Zhang, Huihui, Feng, Shanshan, Cao, Jian, Sun, Yuqing, editor, Lu, Tun, editor, Wang, Tong, editor, Fan, Hongfei, editor, Liu, Dongning, editor, and Du, Bowen, editor
- Published
- 2024
- Full Text
- View/download PDF
16. DELCAS: Deep Reinforcement Learning Based GPU CaaS Packet Scheduling for Stabilizing QoE in 5G Multi-Access Edge Computing
- Author
Lee, Changha, Lee, Kyungchae, Cho, Gyusang, Youn, Chan-Hyun, Casteleyn, Sven, editor, Mikkonen, Tommi, editor, García Simón, Alberto, editor, Ko, In-Young, editor, and Loseto, Giuseppe, editor
- Published
- 2024
- Full Text
- View/download PDF
17. An Adaptive, Energy-Efficient DRL-Based and MCMC-Based Caching Strategy for IoT Systems
- Author
Karras, Aristeidis, Karras, Christos, Karydis, Ioannis, Avlonitis, Markos, Sioutas, Spyros, Chatzigiannakis, Ioannis, editor, and Karydis, Ioannis, editor
- Published
- 2024
- Full Text
- View/download PDF
18. CCA-MTFCN: A Robotic Pushing-Grasping Collaborative Method Based on Deep Reinforcement Learning
- Author
Xu, Haiyuan, Wang, Qi, Min, Huasong, Sun, Fuchun, editor, Meng, Qinghu, editor, Fu, Zhumu, editor, and Fang, Bin, editor
- Published
- 2024
- Full Text
- View/download PDF
19. Reinforcement learning for cooling rate control during quenching
- Author
Hachem, Elie, Vishwasrao, Abhijeet, Renault, Maxime, Viquerat, Jonathan, and Meliga, P.
- Published
- 2024
- Full Text
- View/download PDF
20. An LSTM-based hybrid proximal policy optimization spectrum access algorithm in vehicular network
- Author
Kang, Lin, Chen, Junjie, Wang, Jie, and Wei, Yaqi
- Published
- 2024
- Full Text
- View/download PDF
21. A new concept for large additive manufacturing in construction: tower crane-based 3D printing controlled by deep reinforcement learning
- Author
Parisi, Fabio, Sangiorgio, Valentino, Parisi, Nicola, Mangini, Agostino M., Fanti, Maria Pia, and Adam, Jose M.
- Published
- 2024
- Full Text
- View/download PDF
22. Guest Editorial: Operational and structural resilience of power grids with high penetration of renewables.
- Author
Lei, Shunbo, Zhang, Yichen, Shahidehpour, Mohammad, Hou, Yunhe, Panteli, Mathaios, Chen, Xia, Aydin, Nazli Yonca, Liang, Liang, Wang, Cheng, Wang, Chong, and She, Buxin
- Subjects
MICROGRIDS, ELECTRIC power distribution grids, CYBER physical systems, MIXED integer linear programming, DEEP reinforcement learning, ARTIFICIAL neural networks, REINFORCEMENT learning, ELECTRIC power
- Published
- 2024
- Full Text
- View/download PDF
23. Optimization of news dissemination push mode by intelligent edge computing technology for deep learning.
- Author
DeGe, JiLe and Sang, Sina
- Subjects
DEEP reinforcement learning, PATTERN recognition systems, SOCIAL media, NEWS websites, RECOMMENDER systems, DEEP learning, REINFORCEMENT learning
- Abstract
The Internet era is an era of information explosion. By 2022, global Internet users had surpassed 4 billion and social media users had exceeded 3 billion. People face a huge volume of news content every day, and it is almost impossible to find interesting information by browsing all of it. Against this background, personalized news recommendation technology has been widely adopted, but it still needs further optimization and improvement. To better push news content of interest to different readers, users' satisfaction with major news websites should be further improved. This study proposes a new recommendation algorithm based on deep learning and reinforcement learning. First, a reinforcement learning (RL) component is introduced on top of deep learning. Deep learning excels at processing large-scale data and complex pattern recognition, but it often suffers from low sample efficiency in complex decision-making and sequential tasks, whereas RL emphasizes learning optimal strategies through continuous trial and error while interacting with the environment. Compared with deep learning, RL is better suited to scenarios that require long-term decision-making and trial-and-error learning: by feeding back the reward signal of each action, the system can better adapt to unknown environments and complex tasks, which compensates for the relative shortcomings of deep learning in these respects. States are mapped to actions to solve the sequential decision problem in the news dissemination process. To let the news recommendation system account for dynamic changes in users' interest in news content, the Deep Deterministic Policy Gradient (DDPG) algorithm is applied to the news recommendation scenario, and opposition-based learning is used to complement and combine the Deep Q-network with the policy network. On this basis, the paper puts forward a mode of intelligent news dissemination and push, and proposes a push process for news dissemination information based on edge computing technology. Finally, a Q-Learning Area Under Curve (AUC) indicator for RL models is proposed; it can efficiently measure the strengths and weaknesses of RL models and facilitates comparing models and evaluating offline experiments. The results show that the DDPG algorithm improves the click-through rate by 2.586% compared with the conventional recommendation algorithm, indicating that the proposed algorithm has clear advantages in making accurate recommendations for users. By optimizing the push mode of intelligent news dissemination, this paper effectively improves the efficiency of news dissemination. In addition, the paper studies the innovative application of intelligent edge technology in news communication, bringing new ideas and practices for advancing news communication methods. Optimizing the push mode of intelligent news dissemination not only improves the user experience but also provides strong support for the application of intelligent edge technology in this field, which has important practical application prospects. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Adaptive control for circulating cooling water system using deep reinforcement learning.
- Author
Xu, Jin, Li, Han, and Zhang, Qingxin
- Subjects
DEEP reinforcement learning, ADAPTIVE control systems, COOLING systems, WATER use, SMART structures, REINFORCEMENT learning
- Abstract
Due to the complex internal working process of circulating cooling water systems, most traditional control methods struggle to achieve stable and precise control. Therefore, this paper presents a novel adaptive control structure for the Twin Delayed Deep Deterministic Policy Gradient algorithm, which is based on a reference trajectory model (TD3-RTM). The structure is based on the Markov decision process of the recirculating cooling water system. Initially, the TD3 algorithm is employed to construct a deep reinforcement learning agent. Subsequently, a state space is selected, and a dense reward function is designed, considering the multivariable characteristics of the recirculating cooling water system. The agent updates its network based on different reward values obtained through interactions with the system, thereby gradually aligning the action values with the optimal policy. The TD3-RTM method introduces a reference trajectory model to accelerate the convergence speed of the agent and reduce oscillations and instability in the control system. Subsequently, simulation experiments were conducted in MATLAB/Simulink. The results show that compared to PID, fuzzy PID, DDPG and TD3, the TD3-RTM method improved the transient time in the flow loop by 6.09s, 5.29s, 0.57s, and 0.77s, respectively, and the Integral of Absolute Error(IAE) indexes decreased by 710.54, 335.1, 135.97, and 89.96, respectively, and the transient time in the temperature loop improved by 25.84s, 13.65s, 15.05s, and 0.81s, and the IAE metrics were reduced by 143.9, 59.13, 31.79, and 1.77, respectively. In addition, the overshooting of the TD3-RTM method in the flow loop was reduced by 17.64, 7.79, and 1.29 per cent, respectively, in comparison with the PID, the fuzzy PID, and the TD3. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
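The TD3-based controller summarized in entry 24 above builds on two standard ingredients of the Twin Delayed DDPG family: clipped double-Q targets and target-policy smoothing. Purely as a generic illustration of those ingredients (not the TD3-RTM implementation from the paper, whose reference trajectory model, state space and dense reward are specific to the circulating cooling water plant), a minimal PyTorch-style sketch of the TD3 bootstrap target could look like the following; the network sizes and dimensions are arbitrary assumptions.

```python
# Generic sketch of the clipped double-Q target used by TD3-style controllers
# (illustrative only; not the TD3-RTM code from the paper above).
import torch
import torch.nn as nn

def mlp(in_dim, out_dim):
    # Small fully connected network standing in for the actor/critic target networks.
    return nn.Sequential(nn.Linear(in_dim, 64), nn.ReLU(), nn.Linear(64, out_dim))

state_dim, action_dim, gamma = 4, 2, 0.99        # placeholder dimensions
actor_target = mlp(state_dim, action_dim)
critic1_target = mlp(state_dim + action_dim, 1)
critic2_target = mlp(state_dim + action_dim, 1)

def td3_target(reward, next_state, done, noise_std=0.2, noise_clip=0.5):
    """Compute the TD3 bootstrap target y = r + gamma * (1 - done) * min(Q1', Q2')(s', a')."""
    with torch.no_grad():
        next_action = torch.tanh(actor_target(next_state))
        # Target-policy smoothing: add clipped noise to the target action.
        noise = (torch.randn_like(next_action) * noise_std).clamp(-noise_clip, noise_clip)
        next_action = (next_action + noise).clamp(-1.0, 1.0)
        sa = torch.cat([next_state, next_action], dim=-1)
        # Clipped double-Q: bootstrap from the smaller of the two target critics.
        q_next = torch.min(critic1_target(sa), critic2_target(sa))
        return reward + gamma * (1.0 - done) * q_next

# Dummy batch of transitions, just to show the call signature.
batch = 8
y = td3_target(torch.zeros(batch, 1), torch.randn(batch, state_dim), torch.zeros(batch, 1))
print(y.shape)  # torch.Size([8, 1])
```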
25. USVs Path Planning for Maritime Search and Rescue Based on POS-DQN: Probability of Success-Deep Q-Network.
- Author
Liu, Lu, Shan, Qihe, and Xu, Qi
- Subjects
DEEP reinforcement learning, RESCUE work, AUTONOMOUS vehicles, PROBLEM solving, ALGORITHMS
- Abstract
Efficient maritime search and rescue (SAR) is crucial for responding to maritime emergencies. In traditional SAR, fixed search path planning is inefficient and cannot prioritize high-probability regions, which has significant limitations. To solve the above problems, this paper proposes unmanned surface vehicles (USVs) path planning for maritime SAR based on POS-DQN so that USVs can perform SAR tasks reasonably and efficiently. Firstly, the search region is allocated as a whole using an improved task allocation algorithm so that the task region of each USV has priority and no duplication. Secondly, this paper considers the probability of success (POS) of the search environment and proposes a POS-DQN algorithm based on deep reinforcement learning. This algorithm can adapt to the complex and changing environment of SAR. It designs a probability weight reward function and trains USV agents to obtain the optimal search path. Finally, based on the simulation results, by considering the complete coverage of obstacle avoidance and collision avoidance, the search path using this algorithm can prioritize high-probability regions and improve the efficiency of SAR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
26. An FPGA-Accelerated CNN with Parallelized Sum Pooling for Onboard Realtime Routing in Dynamic Low-Orbit Satellite Networks.
- Author
Kim, Hyeonwoo, Park, Juhyeon, Lee, Heoncheol, Won, Dongshik, and Han, Myonghun
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, CONVOLUTIONAL neural networks, ROUTING algorithms, GATE array circuits, ORBITS of artificial satellites
- Abstract
This paper addresses the problem of real-time onboard routing for dynamic low earth orbit (LEO) satellite networks. It is difficult to apply general routing algorithms to dynamic LEO networks due to the frequent changes in satellite topology caused by the disconnection between moving satellites. Deep reinforcement learning (DRL) models trained by various dynamic networks can be considered. However, since the inference process with the DRL model requires too long a computation time due to multiple convolutional layer operations, it is not practical to apply to a real-time on-board computer (OBC) with limited computing resources. To solve the problem, this paper proposes a practical co-design method with heterogeneous processors to parallelize and accelerate a part of the multiple convolutional layer operations on a field-programmable gate array (FPGA). The proposed method was tested with a real heterogeneous processor-based OBC and showed that the proposed method was about 3.10 times faster than the conventional method while achieving the same routing results. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Explainability in Deep Reinforcement Learning: A Review into Current Methods and Applications.
- Author
Hickling, Thomas, Zenati, Abdelhafid, Aouf, Nabil, and Spencer, Phillippa
- Published
- 2024
- Full Text
- View/download PDF
28. Deep reinforcement learning for adaptive frequency control of island microgrid considering control performance and economy.
- Author
Du, Wanlin, Huang, Xiangmin, Zhu, Yuanzhe, Wang, Ling, Deng, Wenyang, Yin, Linfei, and Saxena, Sahaj
- Subjects
DEEP reinforcement learning, ADAPTIVE control systems, MICROGRIDS, REINFORCEMENT learning, MAXIMUM entropy method, INDEPENDENT system operators, ADAPTIVE fuzzy control
- Abstract
To achieve frequency stability and economic efficiency in isolated microgrids, grid operators face a trade-off between multiple performance indicators. This paper introduces a data-driven adaptive load frequency control (DD-ALFC) approach, where the load frequency controller is modeled as an agent that can balance different objectives autonomously. The paper also proposes a priority replay soft actor critic (PR-SAC) algorithm to implement the DD-ALFC method. The PR-SAC algorithm enhances the policy randomness by using entropy regularization and maximization, and improves the learning adaptability and generalization by using priority experience replay. The proposed DD-ALFC method based on the PR-SAC algorithm can achieve higher adaptability and robustness in complex microgrid environments with multiple performance indicators, and improve both the frequency control and the economic efficiency. The paper validates the effectiveness of the proposed method in the Zhuzhou Island microgrid. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
29. Guest Editorial: Special issue on computational methods and artificial intelligence applications in low‐carbon energy systems.
- Author
Wang, Yishen, Zhou, Fei, Guerrero, Josep M., Baker, Kyri, Chen, Yize, Wang, Hao, Xu, Bolun, Xu, Qianwen, Zhu, Hong, and Agwan, Utkarsha
- Subjects
ARTIFICIAL intelligence, ARTIFICIAL neural networks, MACHINE learning, REINFORCEMENT learning, DEEP reinforcement learning, DEEP learning
- Abstract
This document is a guest editorial for a special issue on computational methods and artificial intelligence applications in low-carbon energy systems. The editorial highlights the urgent need for advanced computing and artificial intelligence in the clean energy transition to improve system reliability, economics, and sustainability. The special issue includes 19 original research articles covering topics such as energy forecasting, situational awareness, multi-energy system dispatch, and power system operation. The articles present state-of-the-art methods and techniques in these areas, including wind power forecasting, demand-side flexibility, fault diagnosis of photovoltaic strings, and energy management strategies. The authors express their gratitude to the participating authors and anonymous reviewers for their contributions to the special section. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
30. Research on Scheduling Algorithm of Knitting Production Workshop Based on Deep Reinforcement Learning.
- Author
Sun, Lei, Shi, Weimin, Xuan, Chang, and Zhang, Yongchao
- Abstract
Intelligent scheduling of knitting workshops is the key to realizing knitting intelligent manufacturing. In view of the uncertainty of the workshop environment, it is difficult for existing scheduling algorithms to flexibly adjust scheduling strategies. This paper proposes a scheduling algorithm architecture based on deep reinforcement learning (DRL). First, the scheduling problem of knitting intelligent workshops is represented by a disjunctive graph, and a mathematical model is established. Then, a multi-proximal strategy (multi-PPO) optimization training algorithm is designed to obtain the optimal strategy, and the job selection strategy and machine selection strategy are trained at the same time. Finally, a knitting intelligent workshop scheduling experimental platform is built, and the algorithm proposed in this paper is compared with common heuristic rules and metaheuristic algorithms for experimental testing. The results show that the algorithm proposed in this paper is superior to heuristic rules in solving the knitting workshop scheduling problem, and can achieve the accuracy of the metaheuristic algorithm. In addition, the response speed of the algorithm in this paper is excellent, which meets the production scheduling needs of knitting intelligent workshops and has a good guiding significance for promoting knitting intelligent manufacturing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Low-carbon economic dispatch strategy for integrated electrical and gas system with GCCP based on multi-agent deep reinforcement learning.
- Author
Feng, Wentao, Deng, Bingyan, Zhang, Ziwen, Jiang, He, Zheng, Yanxi, Peng, Xinran, Zhang, Le, Jing, Zhiyuan, Qing, Ke, Xi, Xianpeng, Zhang, Bin, and Li, Mingxuan
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, MACHINE learning, CARBON emissions, NATURAL gas, DEEP learning
- Abstract
With the growing concern for the environment, sustainable development centred on a low-carbon economy has become a unifying pursuit for the energy industry. Integrated energy systems (IES) that combine multiple energy sources such as electricity, heat and gas are essential to facilitate the consumption of renewable energy and the reduction of carbon emission. In this paper, gas turbine (GT), carbon capture and storage (CCS) and power-to-gas (P2G) device are introduced to construct a new carbon capture coupling device model, GT-CCS-P2G (GCCP), which is applied to the integrated electrical and gas system (IEGS). Multi-agent soft actor critic (MASAC) applies historical trajectory representations, parameter spatial techniques and deep densification frameworks to reinforcement learning for reducing the detrimental effects of time-series data on the decisional procedure. The energy scheduling problem of IEGS is redefined as a Markov game, which is addressed by adopting a low carbon economic control framework based on MASAC with minimum operating cost and minimum carbon emission as the optimization objectives. To validate the rationality and effectiveness of the proposed low-carbon economy scheduling model of IEGS based on MASAC, this paper simulates and analyses in integrated PJM-5 node system and seven nodes natural gas system. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
32. UAV Coverage Path Planning With Limited Battery Energy Based on Improved Deep Double Q-network.
- Author
Ni, Jianjun, Gu, Yu, Gu, Yang, Zhao, Yonghao, and Shi, Pengfei
- Abstract
In response to the increasingly complex problem of patrolling urban areas, the utilization of deep reinforcement learning algorithms for autonomous unmanned aerial vehicle (UAV) coverage path planning (CPP) has gradually become a research hotspot. CPP's solution needs to consider several complex factors, including landing area, target area coverage and limited battery capacity. Consequently, based on incomplete environmental information, policy learned by sample inefficient deep reinforcement learning algorithms are prone to getting trapped in local optima. To enhance the quality of experience data, a novel reward is proposed to guide UAVs in efficiently traversing the target area under battery limitations. Subsequently, to improve the sample efficiency of deep reinforcement learning algorithms, this paper introduces a novel dynamic soft update method, incorporates the prioritized experience replay mechanism, and presents an improved deep double Q-network (IDDQN) algorithm. Finally, simulation experiments conducted on two different grid maps demonstrate that IDDQN outperforms DDQN significantly. Our method simultaneously enhances the algorithm's sample efficiency and safety performance, thereby enabling UAVs to cover a larger number of target areas. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
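Entry 32 above combines double-DQN target computation with a dynamically scheduled soft update of the target network. As a generic sketch of those two standard building blocks only (not the paper's IDDQN code, and using a fixed rather than dynamic soft-update rate), in PyTorch:

```python
# Generic double-DQN target and soft (Polyak) target-network update,
# the two building blocks referenced in the entry above (illustrative only).
import torch
import torch.nn as nn

q_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))       # online network
target_net = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 4))  # target network
target_net.load_state_dict(q_net.state_dict())
gamma = 0.99

def double_dqn_target(reward, next_state, done):
    """y = r + gamma * (1 - done) * Q_target(s', argmax_a Q_online(s', a))."""
    with torch.no_grad():
        best_action = q_net(next_state).argmax(dim=1, keepdim=True)   # select with the online net
        q_next = target_net(next_state).gather(1, best_action)        # evaluate with the target net
        return reward + gamma * (1.0 - done) * q_next

def soft_update(tau=0.005):
    """Polyak-average the online weights into the target network."""
    for p_t, p in zip(target_net.parameters(), q_net.parameters()):
        p_t.data.mul_(1.0 - tau).add_(tau * p.data)

y = double_dqn_target(torch.zeros(16, 1), torch.randn(16, 8), torch.zeros(16, 1))
soft_update()
print(y.shape)  # torch.Size([16, 1])
```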
33. Computation Offloading with Privacy-Preserving in Multi-Access Edge Computing: A Multi-Agent Deep Reinforcement Learning Approach.
- Author
Dai, Xiang, Luo, Zhongqiang, and Zhang, Wei
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, EDGE computing, REINFORCEMENT (Psychology), TELECOMMUNICATION, QUALITY of service, INTERNET of things
- Abstract
The rapid development of mobile communication technologies and Internet of Things (IoT) devices has introduced new challenges for multi-access edge computing (MEC). A key issue is how to efficiently manage MEC resources and determine the optimal offloading strategy between edge servers and user devices, while also protecting user privacy and thereby improving the Quality of Service (QoS). To address this issue, this paper investigates a privacy-preserving computation offloading scheme, designed to maximize QoS by comprehensively considering privacy protection, delay, energy consumption, and the task discard rate of user devices. We first formalize the privacy issue by introducing the concept of privacy entropy. Then, based on quantified indicators, a multi-objective optimization problem is established. To find an optimal solution to this problem, this paper proposes a computation offloading algorithm based on the Twin delayed deep deterministic policy gradient (TD3-SN-PER), which integrates clipped double-Q learning, prioritized experience replay, and state normalization techniques. Finally, the proposed method is evaluated through simulation analysis. The experimental results demonstrate that our approach can effectively balance multiple performance metrics to achieve optimal QoS. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. Deep reinforcement learning based multi-layered traffic scheduling scheme in data center networks.
- Author
Wu, Guihua
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, SERVER farms (Computer network management), COMPUTER network traffic, SOFTWARE-defined networking, STREAMING video & television
- Abstract
A web search, an online video, a connected Nest device, and hundreds of cloud services all give us a response in a fraction of a second. But what really happens when we click search or send a request? The request travels over the public internet and into a fiber network. Millions of requests or packets of data travel through miles of cable over land and under sea, converging at one of the many data centers that operate all over the world. The data center (DC) is the core site of data operation, storage and forwarding, and an important part of the cloud platform. A large number of commercial switches and servers are usually deployed in the DC, which is a complex set of facilities. A data center network (DCN) is the network deployed inside a DC; because traffic in the DC exhibits the typical characteristics of centralized data exchange and growing volume, further requirements are placed on the DCN. The DCN connects a large-scale server cluster and is the bridge for data transmission and storage. With the expansion of DCs and the increasing number of service types, communication within data centers becomes more frequent; on the other hand, traffic between data centers has also increased dramatically. Considering the multi-layered transmission mode and traffic characteristics of the DCN, this paper proposes a software-defined networking (SDN)-based multi-layered traffic scheduling scheme for the DCN, which mainly focuses on hop count, criticality and cost. Moreover, based on the SDN architecture and the Deep Q-Network of reinforcement learning (RL), an intelligent multi-layered traffic scheduling scheme is proposed to obtain the current optimal global routing strategy according to the real-time traffic demand in the network. The simulation results show that the proposed scheme outperforms benchmarks in terms of average throughput, normalized total throughput, link bandwidth utilization, average round-trip time and network traffic bandwidth loss rate. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Production optimisation in carbon reduction engineering management.
- Author
Wei, Yi-Ming, Huang, Zhimin, Coffman Dalton, D'Maris, Liao, Hua, and Wang, Ke
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, SUSTAINABILITY, GREENHOUSE gases, FLOW shop scheduling, SUPPLY chain management, PROCESS optimization, ENVIRONMENTAL literacy
- Abstract
This document discusses the importance of carbon reduction engineering in combating global climate change. It highlights the role of production and supply chain management in achieving carbon neutrality and reducing carbon emissions. The document presents 20 selected research papers that cover various aspects of production optimization in carbon reduction engineering, including carbon reduction in the production process, low-carbon and sustainable supply chains, assessment and optimization methodologies, and policy instruments. The papers provide insights into topics such as scheduling optimization, renewable energy production, carbon emission efficiency, sustainable consumption, government subsidizing arrangements, and the impact of policy instruments on carbon reduction. The document expresses gratitude to the authors and reviewers for their contributions and acknowledges the support of the International Journal of Production Research. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
36. High-Frequency Quantitative Trading of Digital Currencies Based on Fusion of Deep Reinforcement Learning Models with Evolutionary Strategies.
- Author
Yijun He, Bo Xu, and Xinpu Su
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, ELECTRONIC money, MACHINE learning, CRYPTOCURRENCIES, EVOLUTIONARY models
- Abstract
High-frequency quantitative trading in the emerging digital currency market poses unique challenges due to the lack of established methods for extracting trading information. This paper proposes a deep evolutionary reinforcement learning (DERL) model that combines deep reinforcement learning with evolutionary strategies to address these challenges. Reinforcement learning is applied to data cleaning and factor extraction from a high-frequency, microscopic view-point to quantitatively explain the supply and demand imbalance and to create trading strategies. In order to determine whether the algorithm can successfully extract the significant hidden features in the factors when faced with large and complex high-frequency factors, this paper trains the agent in reinforcement learning using three different learning algorithms, including Q-learning, evolutionary strategies, and policy gradient. The experimental dataset, which contains data on sharp up, sharp down, and continuous oscillation situations, was chosen to test Bitcoin in January-February, September, and November of 2022. According to the experimental results, the evolutionary strategies algorithm achieved returns of 59.18%, 25.14%, and 22.72%, respectively. The results demonstrate that deep reinforcement learning based on the evolutionary strategies outperforms Q-learning and policy gradient concerning risk resistance and return capability. The proposed approach offers a robust and adaptive solution for high-frequency trading in the digital currency market, contributing to the development of effective quantitative trading strategies. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. DRL-based Task and Computational Offloading for Internet of Vehicles in Decentralized Computing.
- Author
Zhang, Ziyang, Gu, Keyu, and Xu, Zijie
- Abstract
This paper focuses on the problem of computation offloading in a high-mobility Internet of Vehicles (IoVs) environment. The goal is to address the challenges related to latency, energy consumption, and payment cost requirements. The approach considers both moving and parked vehicles as fog nodes, which can assist in offloading computational tasks. However, as the number of vehicles increases, the action space for each agent grows exponentially, posing a challenge for decentralised decision-making. The dynamic nature of vehicular mobility further complicates the network dynamics, requiring joint cooperative behaviour from the learning agents to achieve convergence. The traditional deep reinforcement learning (DRL) approach for offloading in IoVs treats each agent as an independent learner. It ignores the actions of other agents during the training process. This paper utilises a cooperative three-layer decentralised architecture called Vehicle-Assisted Multi-Access Edge Computing (VMEC) to overcome this limitation. The VMEC network consists of three layers: the fog, cloudlet, and cloud layers. In the fog layer, vehicles within associated Roadside Units (RSUs) and neighbouring RSUs participate as fog nodes. The middle layer comprises Mobile Edge Computing (MEC) servers, while the top layer represents the cloud infrastructure. To address the dynamic task offloading problem in VMEC, the paper proposes using a Decentralized Framework of Task and Computational Offloading (DFTCO), which utilises the strength of MADRL and NOMA techniques. This approach considers multiple agents making offloading decisions simultaneously and aims to find the optimal matching between tasks and available resources. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
38. Resource Scheduling in URLLC and eMBB Coexistence Based on Dynamic Selection Numerology.
- Author
Wang, Lei, Tao, Sijie, Zhao, Lindong, Zhou, Dengyou, Liu, Zhe, and Sun, Yanbing
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, WIRELESS Internet, RESOURCE allocation, SIMULATION software, FEATURE selection
- Abstract
This paper focuses on the resource allocation problem of multiplexing two different service scenarios, enhanced mobile broadband (eMBB) and ultrareliable low latency (URLLC) in 5G New Radio, based on dynamic numerology structure, mini-time slot scheduling, and puncturing to achieve optimal resource allocation. To obtain the optimal channel resource allocation under URLLC user constraints, this paper establishes a relevant channel model divided into two convex optimization problems: (a) eMBB resource allocation and (b) URLLC scheduling. We also determine the numerology values at the beginning of each time slot with the help of deep reinforcement learning to achieve flexible resource scheduling. The proposed algorithm is verified in simulation software, and the simulation results show that the dynamic selection of numerologies proposed in this paper can better improve the data transmission rate of eMBB users and reduce the latency of URLLC services compared with the fixed numerology scheme for the same URLLC packet arrival, while the reasonable resource allocation ensures the reliability of URLLC and eMBB communication. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Bioinspired Artificial Intelligence Applications 2023.
- Author
Wei, Haoran, Tao, Fei, Huang, Zhenghua, and Long, Yanhua
- Subjects
ARTIFICIAL intelligence, DEEP learning, REINFORCEMENT learning, MACHINE learning, DEEP reinforcement learning, NATURAL language processing
- Abstract
This document discusses the rapid development of Artificial Intelligence (AI) and its bioinspired applications. It highlights the benefits of bioinspired AI, such as increased accuracy in image and speech processing, reduced cost and energy usage through edge devices, and enhanced bio-signal quality. However, it also acknowledges the challenges posed by improper AI utilization, such as the generation of fake news and security issues. The document calls for research papers on bioinspired AI applications to explore its potential and address these challenges. It includes examples of research papers that utilize deep reinforcement learning for robot task sequencing, propose a real-time multi-surveillance pedestrian target detection model, develop an intelligent breast mass classification approach, and introduce a bio-inspired object detection algorithm for remote sensing images. The document concludes by emphasizing the importance of biomimetic artificial intelligence in various fields and promoting further research in this area. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
40. AI Applications to Enhance Resilience in Power Systems and Microgrids—A Review.
- Author
Zahraoui, Younes, Korõtko, Tarmo, Rosin, Argo, Mekhilef, Saad, Seyedmahmoudian, Mehdi, Stojcevski, Alex, and Alhamrouni, Ibrahim
- Abstract
This paper presents an in-depth exploration of the application of Artificial Intelligence (AI) in enhancing the resilience of microgrids. It begins with an overview of the impact of natural events on power systems and provides data and insights related to power outages and blackouts caused by natural events in Estonia, setting the context for the need for resilient power systems. Then, the paper delves into the concept of resilience and the role of microgrids in maintaining power stability. The paper reviews various AI techniques and methods, and their application in power systems and microgrids. It further investigates how AI can be leveraged to improve the resilience of microgrids, particularly during different phases of an event occurrence time (pre-event, during event, and post-event). A comparative analysis of the performance of various AI models is presented, highlighting their ability to maintain stability and ensure a reliable power supply. This comprehensive review contributes significantly to the existing body of knowledge and sets the stage for future research in this field. The paper concludes with a discussion of future work and directions, emphasizing the potential of AI in revolutionizing power system monitoring and control. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
41. Data-driven active corrective control in power systems: an interpretable deep reinforcement learning approach.
- Author
Li, Beibei, Liu, Qian, Hong, Yue, He, Yuxiong, Zhang, Lihong, He, Zhihong, Feng, Xiaoze, Gao, Tianlu, Yang, Li, Yan, Ziming, and Zhang, Cong
- Subjects
DEEP reinforcement learning, ARTIFICIAL intelligence, REINFORCEMENT learning, MARKOV processes, DECISION making
- Abstract
With the successful application of artificial intelligence technology in various fields, deep reinforcement learning (DRL) algorithms have been applied to active corrective control in power systems to improve accuracy and efficiency. However, the "black-box" nature of deep reinforcement learning models reduces their reliability in practical applications, making it difficult for operators to comprehend the decision-making process of these models and thus undermining their credibility. In this paper, a DRL model is constructed based on the Markov decision process (MDP) to effectively address active corrective control issues in a 36-bus system. Furthermore, a feature-importance explainability method is proposed, and the results validate that it enhances the transparency and reliability of the DRL model for active corrective control. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. A path planning method based on deep reinforcement learning for crowd evacuation.
- Author
Meng, Xiangdong, Liu, Hong, and Li, Wenhao
- Abstract
Deep reinforcement learning (DRL) is suitable for solving complex path-planning problems due to its excellent ability to make continuous decisions in a complex environment. However, the increase in population size in the crowd evacuation path-planning problem places a substantial computational burden on the algorithm, which leads to unsatisfactory efficiency of current DRL algorithms. This paper presents a path planning method based on DRL for crowd evacuation to solve this problem. First, we divide crowds into groups based on their relationships and distance from each other and select leaders from them. Next, we extend the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) to propose an Optimized Multi-Agent Deep Deterministic Policy Gradient (OMADDPG) algorithm to obtain the global evacuation path. The OMADDPG algorithm uses the Cross-Entropy Method (CEM) to optimize the policy and improves the neural network's training efficiency by applying a Data Pruning (DP) algorithm. In addition, the social force model is improved by incorporating the relationships between individuals and psychological factors into the model. Finally, this paper combines the improved social force model and the OMADDPG algorithm. The OMADDPG algorithm transmits the path information to the leaders, and pedestrians in the environment are driven by the improved social force model to follow the leaders and complete the evacuation simulation. The method can use a leader to guide pedestrians safely to the exit and reduce evacuation time in different environments. The simulation results prove the efficiency of the path planning method. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
43. UAV Path Planning Based on Random Obstacle Training and Linear Soft Update of DRL in Dense Urban Environment.
- Author
Zhu, Yanfei, Tan, Yingjie, Chen, Yongfa, Chen, Liudan, and Lee, Kwang Y.
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, ENERGY consumption, CONSUMPTION (Economics)
- Abstract
The three-dimensional (3D) path planning problem of an Unmanned Aerial Vehicle (UAV) considering the effect of environmental wind in a dense city is investigated in this paper. The mission of the UAV is to fly from its initial position to its destination while ensuring safe flight. The dense obstacle avoidance and the energy consumption in 3D space need to be considered during the mission, which are often ignored in common studies. To solve these problems, an improved Deep Reinforcement Learning (DRL) path planning algorithm based on Double Deep Q-Network (DDQN) is proposed in this paper. Among the algorithms, the random obstacle training method is first proposed to make the algorithm consider various flight scenarios more globally and comprehensively and improve the algorithm's robustness and adaptability. Then, the linear soft update strategy is employed to realize the smooth neural network parameter update, which enhances the stability and convergence of the training. In addition, the wind disturbances are integrated into the energy consumption model and reward function, which can effectively describe the wind disturbances during the UAV mission to achieve the minimum drag flight. To prevent the neural network from interfering with training failures, the meritocracy mechanism is proposed to enhance the algorithm's stability. The effectiveness and applicability of the proposed method are verified through simulation analysis and comparative studies. The UAV based on this algorithm has good autonomy and adaptability, which provides a new way to solve the UAV path planning problem in dense urban scenes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Reimagining space layout design through deep reinforcement learning.
- Author
Kakooee, Reza and Dillenburger, Benjamin
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, COMPUTER-aided design software, ARCHITECTURAL design, GENETIC algorithms
- Abstract
Space layout design is a critical aspect of architectural design, influencing functionality and aesthetics. The inherent combinatorial nature of layout design poses challenges for traditional planning approaches; thus, it demands the exploration of novel methods. This paper presents a novel framework that leverages the potential of deep reinforcement learning (RL) algorithms to optimize space layouts. RL has demonstrated remarkable success in addressing complex decision-making problems, yet its application in the design process remains relatively unexplored. We argue that RL is particularly well-suited for the design process due to its ability to accommodate offline tasks and seamless integration with existing computer-aided design software, effectively acting as a simulator for design exploration. Framing space layout design as an RL problem and employing RL methods allows for the automated exploration of the expansive design space, thereby enhancing the discovery of innovative solutions. This paper also elucidates the synergy between the design process and the RL problem, which opens new avenues for exploring the potential of RL algorithms in design. We aim to foster experimentation and collaboration within the RL and architecture communities. To facilitate our research, we have developed SpaceLayoutGym , an environment specifically designed for space layout design tasks. SpaceLayoutGym serves as a customizable environment that encapsulates the essential elements of the layout design process within an RL framework. To showcase the effectiveness of SpaceLayoutGym and the capabilities of RL as an artificial space layout designer, we employ the Proximal Policy Optimization (PPO) algorithm to train the RL agent in selected design scenarios with both geometrical constraints and topological objectives. The study further extends to contrast the effectiveness of PPO agents with that of genetic algorithms, and also includes a comparative analysis with existing layouts. Our results demonstrate the potential of RL to optimize space layouts, offering a promising direction for the future of artificial intelligence-aided design. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
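SpaceLayoutGym is described in entry 44 above as a Gym-style environment trained with Proximal Policy Optimization. As a generic illustration of that workflow only (the real SpaceLayoutGym API, observation space and reward are not shown in the abstract; the environment below is a stand-in and the use of Stable-Baselines3 is an assumption, not necessarily the authors' tooling):

```python
# Generic "train PPO on a Gym-registered environment" workflow (illustrative only;
# "CartPole-v1" is a stand-in, not the SpaceLayoutGym environment itself).
import gymnasium as gym
from stable_baselines3 import PPO

env = gym.make("CartPole-v1")              # replace with the layout-design environment
model = PPO("MlpPolicy", env, verbose=0)   # PPO agent with a simple MLP policy
model.learn(total_timesteps=10_000)        # short training run for demonstration

obs, _ = env.reset()
for _ in range(100):
    action, _ = model.predict(obs, deterministic=True)
    obs, reward, terminated, truncated, _ = env.step(action)
    if terminated or truncated:
        obs, _ = env.reset()
```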
45. A Modular Robotic Arm Configuration Design Method Based on Double DQN with Prioritized Experience Replay.
- Author
Ding, Ziyan, Tang, Haijun, Wan, Haiying, Zhang, Chengxi, and Sun, Ran
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, ROBOTICS
- Abstract
The modular robotic arms can achieve desired performances in different scenarios through the combination of various modules, and concurrently hold the potential to exhibit geometric symmetry and uniform mass symmetry. Therefore, selecting the appropriate combination of modules is crucial for realizing the functions of the robotic arm and ensuring the elegance of the system. To this end, this paper proposes a double deep Q-network (DDQN)-based configuration design algorithm for modular robotic arms, which aims to find the optimal configuration under different tasks. First, a library of small modules of collaborative robotic arms consisting of multiple tandem robotic arms is constructed. These modules are described in a standard format that can be directly imported into the software for simulation, providing greater convenience and flexibility in the development of modular robotic arms. Subsequently, the DDQN design framework for module selection is established to obtain the optimal robotic arm configuration. The proposed method could deal with the overestimation problem in the traditional deep Q-network (DQN) method and improve the estimation accuracy of the value function for each module. In addition, the experience replay mechanism is improved based on the SumTree technique, which enables the algorithm to make effective use of historical experience and prevents the algorithm from falling into local optimal solutions. Finally, comparative experiments are carried out on the PyBullet simulation platform to verify the effectiveness and superiority of the configuration design method developed in the paper. The simulation results show that the proposed DDQN-based method with experience replay mechanism has higher search efficiency and accuracy compared to the traditional DQN scheme. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
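The configuration-design method in entry 45 above draws its prioritized experience replay samples through a SumTree, so that transitions are sampled in proportion to their priorities. A bare-bones SumTree supporting proportional sampling (an illustration of the data structure named in the abstract, not the authors' implementation) can be written as:

```python
# Bare-bones SumTree for proportional prioritized experience replay (illustrative only).
import random

class SumTree:
    def __init__(self, capacity):
        self.capacity = capacity            # number of leaves (stored transitions)
        self.tree = [0.0] * (2 * capacity)  # heap layout: root at index 1, leaves at capacity..2*capacity-1
        self.data = [None] * capacity
        self.next_idx = 0

    def add(self, priority, item):
        leaf = self.next_idx + self.capacity
        self.data[self.next_idx] = item
        self.update(leaf, priority)
        self.next_idx = (self.next_idx + 1) % self.capacity  # overwrite oldest when full

    def update(self, leaf, priority):
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf > 1:                     # propagate the change up to the root
            leaf //= 2
            self.tree[leaf] += change

    def sample(self):
        # Walk down from the root, choosing children in proportion to their sums.
        s = random.uniform(0.0, self.tree[1])
        idx = 1
        while idx < self.capacity:          # descend until a leaf is reached
            left = 2 * idx
            if s <= self.tree[left]:
                idx = left
            else:
                s -= self.tree[left]
                idx = left + 1
        return self.data[idx - self.capacity]

tree = SumTree(capacity=4)
for i, p in enumerate([1.0, 2.0, 4.0, 8.0]):
    tree.add(p, f"transition-{i}")
print(tree.sample())  # higher-priority transitions are returned more often
```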
46. Quantum-inspired deep reinforcement learning for adaptive frequency control of low carbon park island microgrid considering renewable energy sources.
- Author
Shen, Xin, Tang, Jianlin, Pan, Feng, Qian, Bin, Zhao, Yitao, Yang, Cheng, Yin, Linfei, and Cheng, Miao
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, RENEWABLE energy sources, ADAPTIVE control systems, MACHINE learning, TRANSVERSE reinforcements, BIOLOGICALLY inspired computing
- Abstract
The low carbon park islanded microgrid faces operational challenges due to the high variability and uncertainty of distributed renewable energy sources. These sources cause severe random disturbances that impair the frequency control performance and increase the regulation cost of the islanded microgrid, jeopardizing its safety and stability. This paper presents a data-driven intelligent load frequency control (DDI-LFC) method to address this problem. The method replaces the conventional LFC controller with an intelligent agent based on a deep reinforcement learning algorithm. To adapt to the complex islanded microgrid environment and achieve adaptive multi-objective optimal frequency control, this paper proposes the quantum-inspired maximum entropy actor-critic (QIS-MEAC) algorithm, which incorporates the quantum-inspired principle and the maximum entropy exploration strategy into the actor-critic algorithm. The algorithm transforms the experience into a quantum state and leverages the quantum features to improve the deep reinforcement learning's experience replay mechanism, enhancing the data efficiency and robustness of the algorithm and thus the quality of DDI-LFC. The validation on the Yongxing Island isolated microgrid model of China Southern Grid (CSG) demonstrates that the proposed method utilizes the frequency regulation potential of distributed generation, and reduces the frequency deviation and generation cost. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
47. Research on load frequency control of multi‐microgrids in an isolated system based on the multi‐agent soft actor‐critic algorithm.
- Author
Xie, Li Long, Li, Yonghui, Fan, Peixiao, Wan, Li, Zhang, Kanjun, and Yang, Jun
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, MULTIAGENT systems, DISTRIBUTED algorithms, ALGORITHMS, FREQUENCY stability, MICROGRIDS
- Abstract
Load variation, distributed power output uncertainty and multi‐microgrids network complexity have brought great difficulties to the frequency stability of the whole microgrid. To address this problem, this paper uses a multi‐agent deep reinforcement learning(DRL) algorithm to design the controllers to control the frequency of the multi‐microgrids. Firstly, a load frequency control (LFC) model for multi‐microgrids was built. Secondly, based on the centralized training and decentralized execution (CTDE) multi‐agent reinforcement learning (RL) framework, the multi‐agent soft actor‐critic (MASAC) algorithm was designed and applied to the multi‐microgrids model. The state space and action space of multi‐agent were established according to the frequency deviation of every sub‐microgrid and the output of each distributed power source. The reward function was then established according to the frequency deviation. The appropriate neural network and training parameters were selected to generate the interconnected microgrid controllers through multiple training of pre‐learning. Finally, the simulation study shows that the MASAC controller proposed in this paper can quickly maintain frequency stability when the system is disturbed. Sensitivity analysis shows that the MASAC controller can effectively cope with the uncertainty of the system parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
48. Energy-efficient UAV-enabled computation offloading for industrial internet of things: a deep reinforcement learning approach
- Author
Shi, Shuo, Wang, Meng, Gu, Shushi, and Zheng, Zhong
- Published
- 2024
- Full Text
- View/download PDF
49. An improved deep reinforcement learning-based scheduling approach for dynamic task scheduling in cloud manufacturing.
- Author
Xiaohan Wang, Lin Zhang, Yongkui Liu, and Yuanjun Laili
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, SCHEDULING
- Abstract
Dynamic task scheduling problem in cloud manufacturing (CMfg) is always challenging because of changing manufacturing requirements and services. To make instant decisions for task requirements, deep reinforcement learning-based (DRL-based) methods have been broadly applied to learn the scheduling policies of service providers. However, the current DRL-based scheduling methods struggle to fine-tune a pre-trained policy effectively. The resulting training from scratch takes more time and may easily overfit the environment. Additionally, most DRL-based methods with uneven action distribution and inefficient output masks largely reduce the training efficiency, thus degrading the solution quality. To this end, this paper proposes an improved DRL-based approach for dynamic task scheduling in CMfg. First, the paper uncovers the causes behind the inadequate fine-tuning ability and low training efficiency observed in existing DRL-based scheduling methods. Subsequently, a novel approach is proposed to address these issues by updating the scheduling policy while considering the distribution distance between the pre-training dataset and the in-training policy. Uncertainty weights are introduced to the loss function, and the output mask is extended to the updating procedures. Numerical experiments on thirty actual scheduling instances validate that the solution quality and generalization of the proposed approach surpass other DRL-based methods at most by 32.8% and 28.6%, respectively. Additionally, our method can effectively fine-tune a pre-trained scheduling policy, resulting in an average reward increase of up to 23.8%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
50. MADDPG-D2: An Intelligent Dynamic Task Allocation Algorithm Based on Multi-Agent Architecture Driven by Prior Knowledge.
- Author
Li, Tengda, Wang, Gang, and Fu, Qiang
- Subjects
REINFORCEMENT learning, DEEP reinforcement learning, MACHINE learning, MULTIAGENT systems, PROBLEM solving
- Abstract
To address the problems of low solution accuracy and high decision pressure that a single agent faces in large-scale dynamic task allocation (DTA) with a high-dimensional decision space, this paper combines deep reinforcement learning (DRL) theory with a multi-agent architecture and proposes an improved Multi-Agent Deep Deterministic Policy Gradient algorithm with a dual experience replay pool and dual noise (MADDPG-D2) to improve the efficiency of DTA. The algorithm is based on the traditional Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm and introduces a double noise mechanism to enlarge the action exploration space in the early stage of training and a double experience pool to improve data utilization; at the same time, in order to accelerate the training speed and efficiency of the agents and to solve the cold-start problem of training, prior knowledge is applied to the training of the algorithm. Finally, the MADDPG-D2 algorithm is compared and analyzed on a digital battlefield of ground and air confrontation. The experimental results show that the agents trained by the MADDPG-D2 algorithm achieve higher win rates and average rewards, can utilize resources more reasonably, and better overcome the difficulty that traditional single-agent algorithms face when solving problems in high-dimensional decision spaces. The MADDPG-D2 algorithm based on a multi-agent architecture proposed in this paper shows clear advantages and rationality in DTA. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF