24 results for "deep-reinforcement learning"
Search Results
2. The Impact of LiDAR Configuration on Goal-Based Navigation within a Deep Reinforcement Learning Framework.
- Author
Olayemi, Kabirat Bolanle, Van, Mien, McLoone, Sean, McIlvanna, Stephen, Sun, Yuzhu, Close, Jack, and Nguyen, Nhat Minh
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, LIDAR, OPTICAL radar, ROBOTIC path planning, TIME complexity
- Abstract
Over the years, deep reinforcement learning (DRL) has shown great potential in mapless autonomous robot navigation and path planning. These DRL methods rely on robots equipped with light detection and ranging (LiDAR) sensors with a wide field of view (FOV) configuration to perceive their environment. Such LiDAR sensors are expensive and not suitable for small-scale applications. In this paper, we address the effect of the LiDAR sensor configuration on the performance of DRL models, focusing on avoiding static obstacles ahead. We propose a novel approach that determines an initial FOV by calculating an angle of view using the sensor's width and the minimum safe distance required between the robot and the obstacle. The beams returned within the FOV, the robot's velocities, the robot's orientation to the goal point, and the distance to the goal point are used as the input state to generate new velocity values as the output action of the DRL. The cost function of collision avoidance and path planning is defined as the reward of the DRL model. To verify the performance of the proposed method, we adjusted the proposed FOV by ±10°, giving a narrower and a wider FOV; these new FOVs were trained to obtain collision avoidance and path planning DRL models to validate the proposed method. Our experiments show that the LiDAR configuration with the computed angle of view as its FOV performs best, with a success rate of 98% and a lower time complexity of 0.25 m/s. Additionally, using a Husky Robot, we demonstrate the model's good performance and applicability in the real world. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
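The abstract above derives an initial FOV from the sensor's width and the minimum safe obstacle distance. The exact formula is not given in the abstract, so the geometry below (the angle subtended by the sensor width at the safe distance) is a plausible reconstruction, not the authors' method:

```python
import math

def initial_fov(sensor_width, min_safe_distance):
    """Angle of view (radians) subtended by a sensor of the given width
    at the minimum safe distance; an assumed geometry, for illustration."""
    return 2.0 * math.atan((sensor_width / 2.0) / min_safe_distance)

# Example: a 0.5 m-wide footprint at a 1.0 m safe distance
fov = math.degrees(initial_fov(0.5, 1.0))  # about 28 degrees
```

Wider sensors or shorter safe distances yield wider FOVs, matching the intuition that the robot must see the whole region it could collide with.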
3. DRL-Based Dynamic Destroy Approaches for Agile-Satellite Mission Planning.
- Author
Huang, Wei, Li, Zongwang, He, Xiaohe, Xiang, Junyan, Du, Xu, and Liang, Xuwen
- Subjects
REMOTE sensing
- Abstract
Agile-satellite mission planning is a crucial issue in the construction of satellite constellations. The large scale of remote sensing missions and the high complexity of constraints in agile-satellite mission planning pose challenges in the search for an optimal solution. To tackle the issue, a dynamic destroy deep-reinforcement learning (D3RL) model is designed to facilitate subsequent optimization operations via adaptive destruction to the existing solutions. Specifically, we first perform a clustering and embedding operation to reconstruct tasks into a clustering graph, thereby improving data utilization. Secondly, the D3RL model is established based on graph attention networks (GATs) to enhance the search efficiency for optimal solutions. Moreover, we present two applications of the D3RL model for intensive scenes: the deep-reinforcement learning (DRL) method and the D3RL-based large-neighborhood search method (DRL-LNS). Experimental simulation results illustrate that the D3RL-based approaches outperform the competition in terms of solutions' quality and computational efficiency, particularly in more challenging large-scale scenarios. DRL-LNS outperforms ALNS with an average scheduling rate improvement of approximately 11% in Area instances. In contrast, the DRL approach performs better in World scenarios, with an average scheduling rate that is around 8% higher than that of ALNS. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
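The D3RL approach above adaptively destroys part of an existing solution and re-optimizes it. A generic destroy-and-repair (large-neighborhood search) skeleton conveys the idea; the toy task-ordering objective and the random destroy/repair operators below are illustrative stand-ins for the paper's learned, GAT-based destroy policy:

```python
import random

def lns(initial, cost, destroy, repair, iters=300, seed=0):
    """Generic large-neighborhood search: repeatedly destroy part of the
    incumbent solution and repair it, accepting non-worsening candidates.
    A textbook skeleton only, not the paper's D3RL model."""
    rng = random.Random(seed)
    best = incumbent = list(initial)
    for _ in range(iters):
        candidate = repair(destroy(incumbent, rng), rng)
        if cost(candidate) <= cost(incumbent):
            incumbent = candidate
        if cost(candidate) < cost(best):
            best = candidate
    return best

# Toy objective: order task durations so that long tasks come early
# (cost weights each duration by its position in the order).
tasks = [5, 3, 8, 1, 9, 2]
cost = lambda order: sum(i * d for i, d in enumerate(order))
destroy = lambda order, rng: [d for d in order if rng.random() > 0.3]  # drop ~30%

def repair(partial, rng):
    out = list(partial)
    for d in (d for d in tasks if d not in out):  # reinsert dropped tasks
        out.insert(rng.randrange(len(out) + 1), d)
    return out

best = lns(tasks, cost, destroy, repair)
```

The learned component in the paper replaces the uniform-random destroy operator with a policy that targets the parts of the schedule most worth rebuilding.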
4. Research on Reinforcement Learning for Mobile Robot Pedestrian-Avoidance Strategies [in Chinese].
- Author
王唯鉴, 王勇, 杨骁, 吕宗喆, and 吴宗毅
- Subjects
DEEP reinforcement learning, REINFORCEMENT learning, MOBILE robots, FUNCTION spaces, PEDESTRIANS, ROBOTS
- Abstract
Copyright of Journal of Computer Engineering & Applications is the property of Beijing Journal of Computer Engineering & Applications Journal Co Ltd. and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
5. Scheduling single-satellite observation and transmission tasks by using hybrid Actor-Critic reinforcement learning.
- Author
Wen, Zhijiang, Li, Lu, Song, Jiakai, Zhang, Shengyu, and Hu, Haiying
- Subjects
REINFORCEMENT learning, ARTIFICIAL satellites, DATA transmission systems, SCHEDULING, DATA warehousing, ELECTRONIC data processing
- Abstract
• Integrated scheduling of data transmission and observation tasks achieves better schedule results. • To allow more flexible scheduling, a time-continuous model of the data transmission process is established to enable accurate timing decisions. • A hybrid Actor-Critic reinforcement learning method is designed to solve single-satellite observation and transmission task scheduling problems, and performs well in intensive observation scenarios. • Different training methods are applied to meet various scheduling requirements and random scenarios. Earth observation satellites (EOS) generate a large amount of observation data in intensive observation scenarios, while onboard data storage capacity is limited, making it imperative to integrate satellite observation and data transmission tasks. This paper establishes a time-continuous model for the single-EOS integrated scheduling problem that considers data transmission and observation simultaneously. A hybrid Actor-Critic reinforcement learning method is adopted to solve the integrated scheduling problem more efficiently in intensive observation scenarios. Furthermore, the algorithm can flexibly determine the start and end times of each data transmission task. Experimental results show that the hybrid Actor-Critic reinforcement learning method handles large-scale problems with high efficiency and good results. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
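For readers unfamiliar with the actor-critic pattern the abstract builds on, here is a minimal sketch on a toy two-action problem: a softmax actor plus a running-average critic used as a baseline. It illustrates the update structure only, not the paper's hybrid scheduler:

```python
import math, random

def softmax(prefs):
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]

def train(steps=2000, seed=1):
    """Actor-critic in miniature: action 1 always pays reward 1 and action 0
    pays 0, so the actor should learn to prefer action 1."""
    rng = random.Random(seed)
    prefs, value = [0.0, 0.0], 0.0
    for _ in range(steps):
        pi = softmax(prefs)
        action = 0 if rng.random() < pi[0] else 1
        reward = float(action)
        delta = reward - value            # critic's error signal
        value += 0.05 * delta             # critic update (running baseline)
        for b in range(2):                # actor: policy-gradient step
            indicator = 1.0 if b == action else 0.0
            prefs[b] += 0.1 * delta * (indicator - pi[b])
    return softmax(prefs)

pi = train()  # pi[1] should approach 1 as the actor learns
```

The "hybrid" aspect in the paper refers to mixing discrete task-selection and continuous timing decisions, which this two-action toy does not attempt to capture.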
6. Dynamic User Resource Allocation for Downlink Multicarrier NOMA with an Actor–Critic Method.
- Author
Wang, Xinshui, Meng, Ke, Wang, Xu, Liu, Zhibin, and Ma, Yuefeng
- Subjects
RESOURCE allocation, WIRELESS communications, LEARNING ability, COMBINATORIAL optimization, HIGH performance computing, MACHINE learning
- Abstract
Future wireless communication systems must meet ever higher performance requirements. Motivated by this, we study the combinatorial optimization problem of power allocation and dynamic user pairing in a downlink multicarrier non-orthogonal multiple-access (NOMA) system, aiming to maximize the sum rate of all users in the system. Due to the complex coupling of variables, obtaining an optimal solution is difficult and time-consuming, which is impractical for engineering use. To circumvent these difficulties and obtain a sub-optimal solution, we decompose the optimization problem into two sub-problems. First, a closed-form expression for the optimal power allocation is obtained for a given subchannel allocation. Then, we provide the user-pairing scheme using the actor–critic (AC) algorithm. As a promising approach to such exhaustive-search problems, deep-reinforcement learning (DRL) possesses higher learning ability and better self-adaptive capability than traditional optimization methods. Simulation results demonstrate that our method has significant advantages over traditional methods and other deep-learning algorithms, and effectively improves the communication performance of NOMA transmission. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
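The power-allocation sub-problem above rests on the standard two-user downlink NOMA rate model with successive interference cancellation (SIC): the far (weak) user decodes its own signal treating the near user's power as interference, while the near (strong) user cancels the far user's signal first. These are textbook expressions, not the paper's closed-form solution:

```python
import math

def noma_pair_rates(p_near, p_far, g_near, g_far, noise=1.0):
    """Achievable rates (bits/s/Hz) for a two-user downlink NOMA pair
    with SIC; standard textbook formulas for illustration."""
    # far user: its signal carries more power, near user's power is interference
    r_far = math.log2(1 + p_far * g_far / (p_near * g_far + noise))
    # near user: decodes and cancels the far user's signal, then sees only noise
    r_near = math.log2(1 + p_near * g_near / noise)
    return r_near, r_far

r_near, r_far = noma_pair_rates(p_near=0.2, p_far=0.8, g_near=10.0, g_far=2.0)
```

The closed-form allocation in the paper chooses `p_near`/`p_far` per subchannel to maximize the sum of such rates, which is why the user-pairing sub-problem can then be handled separately by the actor-critic agent.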
7. A Deep Reinforcement Learning-Based Routing Algorithm for UAV Ad Hoc Networks [in Chinese].
- Author
乔冠华, 吴麒, 王翔, 潘俊男, 张易新, and 丁建
- Subjects
REINFORCEMENT learning, ROUTING algorithms, AD hoc computer networks, END-to-end delay, STARTUP costs, ACTIVE learning, DRONE aircraft
- Abstract
Copyright of Journal of Chongqing University of Posts & Telecommunications (Natural Science Edition) is the property of Chongqing University of Posts & Telecommunications and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
8. Integrated Guidance-and-Control Design for Three-Dimensional Interception Based on Deep-Reinforcement Learning.
- Author
Wang, Wenwen, Wu, Mingyu, Chen, Zhihua, and Liu, Xiaoli
- Subjects
REWARD (Psychology), ANGULAR velocity, PROPORTIONAL navigation, REINFORCEMENT learning, DETERMINISTIC algorithms, SYSTEM dynamics, LEARNING
- Abstract
This study applies deep-reinforcement-learning algorithms to integrated guidance and control for three-dimensional, high-maneuverability missile-target interception. The research constructed a dynamic environment, multi-factor reward functions, agents based on the deep deterministic policy gradient algorithm, and action signals with pitch and yaw fins as control commands, which steer the missile to intercept targets. Firstly, the missile-interception system includes dynamics such as the inertia of the missile, the aerodynamic parameters, and fin delays. Secondly, to improve the convergence speed and guidance accuracy, a convergence factor for the angular velocity of the target line of sight and deep dual-filter methods were introduced into the design of the reward function. The proposed method was then compared with traditional proportional navigation. Next, many simulations were carried out against high-maneuverability targets with randomized initial conditions. The numerical-simulation results showed that the proposed guidance strategy has higher guidance accuracy and stronger robustness and generalization capability with respect to the aerodynamic parameters. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
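The baseline the study compares against, classical proportional navigation, has a compact standard form worth recalling: the commanded lateral acceleration is the navigation gain times the closing speed times the line-of-sight rotation rate.

```python
def pn_accel(nav_gain, closing_speed, los_rate):
    """Classical proportional-navigation command a = N * Vc * (LOS rate).
    This is the traditional baseline, not the paper's learned controller."""
    return nav_gain * closing_speed * los_rate

# e.g. N = 3, closing speed 300 m/s, LOS rotating at 0.02 rad/s
a = pn_accel(3.0, 300.0, 0.02)  # commands 18 m/s^2 of lateral acceleration
```

PN drives the line-of-sight rate toward zero, which is exactly the quantity the paper's convergence factor in the reward function also targets.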
10. Optimizing Flying Base Station Connectivity by RAN Slicing and Reinforcement Learning
- Author
Dick Carrillo Melgarejo, Jiri Pokorny, Pavel Seda, Arun Narayanan, Pedro H. J. Nardelli, Mehdi Rasti, Jiri Hosek, Milos Seda, Demostenes Z. Rodriguez, Yevgeni Koucheryavy, and Gustavo Fraidenraich
- Subjects
Flying base stations, UAVs, location optimization, wireless communication, deep-reinforcement learning, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The application of flying base stations (FBS) in wireless communication is becoming a key enabler for improving cellular wireless connectivity. Following this trend, this work aims to enhance the spectral efficiency of FBSs using a radio access network (RAN) slicing framework; the optimization assumes that the FBS locations have already been defined. The framework splits the physical radio resources into three RAN slices, which schedule resources by optimizing individual slice spectral efficiency with a deep reinforcement learning approach. Simulations indicate that the proposed framework generally outperforms, in spectral efficiency, a network that relies only on the heuristic predefined FBS locations, although the gains are not always significant in some specific cases. Finally, spectral efficiency is analyzed for each RAN slice resource and evaluated in terms of service-level agreement (SLA) to characterize the performance of the framework.
- Published
- 2022
- Full Text
- View/download PDF
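Per-slice spectral efficiency in analyses like the one above is typically the Shannon efficiency log2(1 + SNR), aggregated over slices by bandwidth share. A sketch under that common idealization (the slice names and numbers are illustrative, not from the paper):

```python
import math

def slice_spectral_efficiency(snr_linear):
    """Shannon spectral efficiency (bits/s/Hz) at a given linear SNR."""
    return math.log2(1.0 + snr_linear)

def network_se(slices):
    """Bandwidth-share-weighted average spectral efficiency over RAN slices.
    `slices` maps slice name -> (bandwidth share, linear SNR)."""
    return sum(share * slice_spectral_efficiency(snr)
               for share, snr in slices.values())

# Hypothetical three-slice split, shares summing to 1
se = network_se({"embb": (0.5, 15.0), "urllc": (0.3, 7.0), "mmtc": (0.2, 3.0)})
```

The RL scheduler in the paper effectively adjusts how resources are assigned within each slice so that each term of this weighted sum improves without violating the slice's SLA.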
12. An inertial control method for large-scale wind farm based on hierarchical distributed hybrid deep-reinforcement learning.
- Author
Han, Ji and Chen, Zhe
- Subjects
BLENDED learning, FREQUENCY stability, WIND power plants, PARAMETER estimation, WIND turbines, LOCAL mass media, MAXIMUM power point trackers, ELECTRIC transients, REINFORCEMENT learning
- Abstract
Wind farms (WFs) are increasingly required to provide inertial support during frequency events. Existing WF inertial control methods have exhibited limitations in parameter estimation and optimal control parameter tuning under variable operating conditions. Moreover, traditional model-based methods suffer from computational efficiency problems due to their reliance on repeated iterations or derivative computations during optimization. Thus, this paper proposes an inertial control method for large-scale WFs based on hierarchical distributed hybrid deep-reinforcement learning (HDH-DRL). Firstly, the control objectives for inertial control are defined and decomposed on the basis of a wind turbine (WT) division. Next, the paper presents an HDH-DRL-based inertial control framework consisting of two levels. The upper-level control is achieved by a hybrid multi-agent DRL (HMA-DRL) algorithm with a hybrid action exploration mechanism and multi-agent coordination. The consensus-based lower-level control aims to achieve consensus convergence of the control through locally distributed interaction among WTs. Finally, the inertial control processes of the model are exhibited; the influences of DRL algorithms on the control are discussed; the computational speed and solution accuracy of the proposed method are compared with model-based methods; and the multi-scenario applicability and performance under local communication failures are analyzed. The results demonstrate that the proposed method significantly improves the active power response and frequency stability, achieving rapid consensus convergence with a power deviation below 0.3%, surpassing traditional control methods by at least 0.2%; it also demonstrates superior performance across various scenarios, including transient and steady-state conditions, with frequency enhancements of up to 0.30% and exceptional stability under different wind conditions and communication disruptions.
• Introduces HDH-DRL for improved WF inertial control and frequency stability. • Develops a bi-level HDH-DRL framework for faster, accurate control actions. • Utilizes HMA-DRL with a novel exploration mechanism for complex WF operations. • Achieves better accuracy and quicker computation than model-based methods. • Ensures control robustness across varied wind conditions and communication breaks. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
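The lower-level consensus mechanism referenced in the abstract is, at its core, the standard linear consensus iteration x_i <- x_i + eps * sum_j (x_j - x_i) over each node's neighbors. Here it is sketched on a toy ring of four turbines; the DRL layers of the paper are omitted:

```python
def consensus_step(values, neighbors, eps=0.2):
    """One synchronous step of the standard linear consensus update.
    Each node nudges its value toward its neighbors' values."""
    return [x + eps * sum(values[j] - x for j in neighbors[i])
            for i, x in enumerate(values)]

# Ring of four turbines with different initial power set-points
values = [1.0, 3.0, 5.0, 7.0]
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
for _ in range(50):
    values = consensus_step(values, neighbors)
# all values converge to the initial average, 4.0
```

Because each node uses only its neighbors' values, the scheme degrades gracefully under local communication failures, which is the robustness property the paper tests.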
15. MODELLING MALWARE PROPAGATION ON THE INTERNET OF THINGS USING AN AGENT BASED APPROACH ON COMPLEX NETWORKS
- Author
Karanja Evanson Mwangi, Shedden Masupe, and Mandu Jeffrey
- Subjects
internet of things, agent-based modelling and simulation, modelling malware propagation, large-scale-free networks, deep-reinforcement learning, Information technology, T58.5-58.64, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Malware is a major hindrance to efficient information exchange on the Internet of Things (IoT). Modelling malware propagation is one of the most important applications aimed at understanding mechanisms for protecting the IoT environment, and the IoT can be realized using agent-based modelling over complex networks. In this paper, a malware propagation model using an agent-based approach and deep-reinforcement learning on a scale-free IoT network (SFIoT) is presented in detail. The proposed model is named for its transition states, Susceptible-Infected-Immuned-Recovered-Removed (SIIRR), which represent the states of nodes on large-scale complex networks. The reliability of each node is investigated using the Mean Time To Failure (MTTF). The factors considered in the MTTF computations are node degree, node mobility rate, node transmission rate, and the Euclidean distance between two nodes. The results illustrate that the model is comparable to previous models of malware propagation effects in terms of average energy consumption, average infections at time t, node mobility, and propagation speed.
- Published
- 2020
- Full Text
- View/download PDF
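The SIIRR state machine above can be sketched as a synchronous update on a small graph. The abstract names the states but not the exact rates, so the transition rules and probabilities below are illustrative assumptions, not the paper's model:

```python
import random

def siirr_step(states, adj, rng, p_inf=0.3, p_imm=0.05, p_rec=0.2, p_rem=0.05):
    """One synchronous step of a hypothetical SIIRR update on a graph.
    S=Susceptible, I=Infected, M=Immuned, R=Recovered, X=Removed; the
    probabilities are invented for illustration."""
    nxt = list(states)
    for i, s in enumerate(states):
        if s == "S":
            if rng.random() < p_imm:
                nxt[i] = "M"                      # patched -> Immuned
            elif any(states[j] == "I" for j in adj[i]) and rng.random() < p_inf:
                nxt[i] = "I"                      # infected by a neighbor
        elif s == "I":
            r = rng.random()
            if r < p_rem:
                nxt[i] = "X"                      # node fails -> Removed
            elif r < p_rem + p_rec:
                nxt[i] = "R"                      # cleaned -> Recovered
    return nxt

rng = random.Random(42)
adj = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}  # tiny IoT graph
states = ["I", "S", "S", "S"]
for _ in range(20):
    states = siirr_step(states, adj, rng)
```

On the scale-free networks the paper studies, high-degree hubs dominate spread, which is why node degree appears among the MTTF factors.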
16. Portfolio trading system of digital currencies: A deep reinforcement learning with multidimensional attention gating mechanism.
- Author
Weng, Liguo, Sun, Xudong, Xia, Min, Liu, Jia, and Xu, Yiqing
- Subjects
DEEP learning, ELECTRONIC money, FINANCIAL engineering, REINFORCEMENT learning, SHARPE ratio, MACHINE learning
- Abstract
As a hot topic in financial engineering, portfolio optimization aims to increase investors' wealth. In this paper, a portfolio management system based on deep-reinforcement learning is proposed. In contrast to inflexible traditional methods, the proposed system learns a better trading strategy through reinforcement learning, whose reward signal is updated by action weights from deep learning networks. Low, high, and close prices constitute the inputs, but the importance of these three features differs considerably; traditional methods and the classical CNN cannot handle them separately, so our method introduces a specially designed depth convolution that does. In a virtual currency market, price rises occur only in a flash, and traditional methods and CNN networks cannot accurately identify these critical moments. To solve this problem, a three-dimensional attention gating network is proposed that assigns higher weights to rising moments and assets. Under different market conditions, the proposed system achieves more substantial returns and greatly improves the Sharpe ratios, while its short-term risk index is lower than those of the traditional algorithms. Simulation results show that the traditional algorithms (including Best, CRP, PAMR, CWMR and CNN) are unable to perform as well as our approach. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
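The Sharpe ratio that the abstract reports improving is the standard risk-adjusted performance measure: mean excess return divided by its standard deviation. A minimal per-period computation (sample returns are made up):

```python
import math

def sharpe_ratio(returns, risk_free=0.0):
    """Per-period Sharpe ratio: mean excess return over its (population)
    standard deviation. Standard definition, shown for reference."""
    excess = [r - risk_free for r in returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / len(excess)
    return mean / math.sqrt(var)

sr = sharpe_ratio([0.02, -0.01, 0.03, 0.01, -0.005])
```

In practice the per-period value is often annualized by multiplying by the square root of the number of periods per year.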
17. Resource Offload Consolidation Based on Deep-Reinforcement Learning Approach in Cyber-Physical Systems
- Author
Mahammad Shareef Mekala, Xi Zheng, Amjad Anvari-Moghaddam, P. Viswanathan, Gautam Srivastava, and Alireza Jolfaei
- Subjects
game theory, artificial intelligence, edge computing, edge devices, distributed computing, processor scheduling, servers, cloud computing, reinforcement learning, resource provisioning, deep-reinforcement learning, resource management, deep learning, cyber-physical systems, computational modeling, workload, task analysis, performance evaluation, measurement systems
- Abstract
In cyber-physical systems, it is advantageous to combine cloud and edge resources to distribute the workload of processing and computing user data at the point of generation. Services offered by the cloud are not flexible enough against variations in the size of the underlying data, which leads to increased latency, deadline violations, and higher cost. On the other hand, resolving these issues with edge devices that have limited resources is also challenging. In this work, a novel reinforcement learning algorithm, Capacity-Cost Ratio-Reinforcement Learning (CCR-RL), is proposed that considers both resource utilization and cost for the target cyber-physical system. In CCR-RL, the task offloading decision is made considering the data arrival rate, edge device computation power, and underlying transmission capacity. Then, a deep learning model is created to allocate resources based on the underlying communication and computation rates. Moreover, new algorithms are proposed to regulate the allocation of communication and computation resources for the workload among edge devices and edge servers. The simulation results demonstrate that the proposed method achieves minimal latency and a reduced processing cost compared to state-of-the-art schemes.
- Published
- 2022
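The "capacity-cost ratio" in the algorithm's name suggests ranking offload targets by capacity per unit cost. A hypothetical sketch of that selection rule follows; the field names, numbers, and the plain ratio are illustrative assumptions, since the paper's decision is learned rather than a fixed formula:

```python
def pick_target(candidates):
    """Choose the offload target with the best capacity-to-cost ratio.
    Illustrative greedy rule, not the CCR-RL policy itself."""
    return max(candidates, key=lambda c: c["capacity"] / c["cost"])

target = pick_target([
    {"name": "edge-1", "capacity": 8.0, "cost": 2.0},   # ratio 4.0
    {"name": "edge-2", "capacity": 5.0, "cost": 1.0},   # ratio 5.0
    {"name": "cloud", "capacity": 50.0, "cost": 20.0},  # ratio 2.5
])
```

A learned policy can outperform such a static ratio by also reacting to the data arrival rate and current transmission capacity mentioned in the abstract.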
18. Optimizing Flying Base Station Connectivity by RAN Slicing and Reinforcement Learning
- Abstract
The application of flying base stations (FBS) in wireless communication is becoming a key enabler to improve cellular wireless connectivity. Following this tendency, this research work aims to enhance the spectral efficiency of FBSs using the radio access network (RAN) slicing framework; this optimization considers that FBSs’ location was already defined previously. This framework splits the physical radio resources into three RAN slices. These RAN slices schedule resources by optimizing individual slice spectral efficiency by using a deep reinforcement learning approach. The simulation indicates that the proposed framework generally outperforms the spectral efficiency of the network that only considers the heuristic predefined FBS location, although the gains are not always significant in some specific cases. Finally, spectral efficiency is analyzed for each RAN slice resource and evaluated in terms of service-level agreement (SLA) to indicate the performance of the framework.
- Published
- 2022
19. Optimizing Flying Base Station Connectivity by RAN Slicing and Reinforcement Learning
- Abstract
The application of flying base stations (FBS) in wireless communication is becoming a key enabler to improve cellular wireless connectivity. Following this tendency, this research work aims to enhance the spectral efficiency of FBSs using the radio access network (RAN) slicing framework; this optimization considers that FBSs’ location was already defined previously. This framework splits the physical radio resources into three RAN slices. These RAN slices schedule resources by optimizing individual slice spectral efficiency by using a deep reinforcement learning approach. The simulation indicates that the proposed framework generally outperforms the spectral efficiency of the network that only considers the heuristic predefined FBS location, although the gains are not always significant in some specific cases. Finally, spectral efficiency is analyzed for each RAN slice resource and evaluated in terms of service-level agreement (SLA) to indicate the performance of the framework.
- Published
- 2022
20. Optimizing Flying Base Station Connectivity by RAN Slicing and Reinforcement Learning
- Abstract
The application of flying base stations (FBS) in wireless communication is becoming a key enabler to improve cellular wireless connectivity. Following this tendency, this research work aims to enhance the spectral efficiency of FBSs using the radio access network (RAN) slicing framework; this optimization considers that FBSs’ location was already defined previously. This framework splits the physical radio resources into three RAN slices. These RAN slices schedule resources by optimizing individual slice spectral efficiency by using a deep reinforcement learning approach. The simulation indicates that the proposed framework generally outperforms the spectral efficiency of the network that only considers the heuristic predefined FBS location, although the gains are not always significant in some specific cases. Finally, spectral efficiency is analyzed for each RAN slice resource and evaluated in terms of service-level agreement (SLA) to indicate the performance of the framework.
- Published
- 2022
21. Optimizing Flying Base Station Connectivity by RAN Slicing and Reinforcement Learning
- Abstract
The application of flying base stations (FBS) in wireless communication is becoming a key enabler to improve cellular wireless connectivity. Following this tendency, this research work aims to enhance the spectral efficiency of FBSs using the radio access network (RAN) slicing framework; this optimization considers that FBSs’ location was already defined previously. This framework splits the physical radio resources into three RAN slices. These RAN slices schedule resources by optimizing individual slice spectral efficiency by using a deep reinforcement learning approach. The simulation indicates that the proposed framework generally outperforms the spectral efficiency of the network that only considers the heuristic predefined FBS location, although the gains are not always significant in some specific cases. Finally, spectral efficiency is analyzed for each RAN slice resource and evaluated in terms of service-level agreement (SLA) to indicate the performance of the framework.
- Published
- 2022
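The per-slice scheduling loop summarized above can be sketched as a reinforcement learner choosing among resource-block splits; the version below is reduced to a stateless, bandit-style learner for brevity, and the slice names, candidate splits, and reward model are illustrative assumptions, not the paper's implementation.

```python
import random

SLICES = ["eMBB", "URLLC", "mMTC"]           # assumed slice names
SPLITS = [(6, 3, 3), (4, 4, 4), (3, 3, 6)]   # candidate resource-block splits

def spectral_efficiency(split, demand):
    # Toy reward: a slice's efficiency degrades when it receives
    # fewer resource blocks than its demand (an assumed SLA surrogate).
    return sum(min(b, d) / d for b, d in zip(split, demand)) / len(split)

def train(episodes=2000, eps=0.1, alpha=0.2, seed=0):
    rng = random.Random(seed)
    q = [0.0] * len(SPLITS)                  # value estimate per action
    for _ in range(episodes):
        demand = [rng.randint(2, 6) for _ in SLICES]
        if rng.random() < eps:               # epsilon-greedy exploration
            a = rng.randrange(len(SPLITS))
        else:
            a = max(range(len(SPLITS)), key=q.__getitem__)
        r = spectral_efficiency(SPLITS[a], demand)
        q[a] += alpha * (r - q[a])           # incremental value update
    return q

q_values = train()
best_split = SPLITS[max(range(len(q_values)), key=q_values.__getitem__)]
```

A deep RL agent would replace the tabular values with a network conditioned on slice state, but the action-selection and reward structure follow the same pattern.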
22. Deep reinforcement learning-PID based supervisor control method for indirect-contact heat transfer processes in energy systems.
- Author
-
Wang, Xuan, Cai, Jinwen, Wang, Rui, Shu, Gequn, Tian, Hua, Wang, Mingtao, and Yan, Bowen
- Subjects
- *
HEAT transfer , *REINFORCEMENT learning , *ENERGY transfer , *HEAT exchangers , *TEMPERATURE control , *RANKINE cycle - Abstract
Indirect-contact heat exchangers are widely used in energy systems, and precise tracking control of key heat transfer parameters, such as temperature, is vital for safe and efficient operation. However, the high nonlinearity of heat transfer and large disturbances make optimal control difficult. Exploiting the strong perception and decision-making capabilities of deep reinforcement learning (DRL), this study proposes a supervisory control method combining DRL with proportional-integral-derivative (PID) control. A minimal set of conveniently measurable variables was derived as agent observations to describe the heat transfer process effectively and thereby improve control efficiency under large disturbances. In addition, the local heat transfer process was used as the training environment to reduce training costs significantly. Finally, superheat temperature control in a complex organic Rankine cycle was simulated in SIMULINK to evaluate the effectiveness of the proposed observation variables and the training and control methods. The results show that the proposed method achieves satisfactory performance: the average absolute tracking error was only 0.246 K under trained and untrained disturbances, versus 4.645 K for PID control. Compared with model predictive control (MPC), the DRL-PID-based supervisory control performed clearly better under a large disturbance, with average absolute tracking errors of 0.288 K (DRL-PID) and 0.509 K (MPC). • A DRL-PID-based supervisory control method for heat transfer processes is proposed. • A set of measurable variables is derived to describe the heat transfer state. • The local process within a complex system is used as the training environment. • The proposed control method outperforms MPC under large disturbances. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
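The supervisory structure described in the abstract, a baseline PID tracking loop with a learned corrective term added to its output, can be sketched as follows; the first-order plant model, gains, and setpoint are illustrative assumptions, not the authors' SIMULINK setup.

```python
class PID:
    """Discrete PID controller with Euler integration."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

def simulate(setpoint=10.0, steps=400, dt=0.1, supervisor=None):
    """Track a superheat setpoint (K) on a toy first-order plant."""
    pid = PID(kp=2.0, ki=0.5, kd=0.1, dt=dt)
    temp = 0.0                        # plant state (superheat, K)
    for _ in range(steps):
        err = setpoint - temp
        u = pid.step(err)
        if supervisor is not None:    # DRL policy would correct u here
            u += supervisor(err, temp)
        # toy heat-transfer dynamics: dT/dt = -a*T + b*u
        temp += dt * (-0.5 * temp + 0.5 * u)
    return temp

final_temp = simulate()
```

In the paper's scheme, the supervisor is a trained DRL agent observing the derived measurable variables; passing `supervisor=None` recovers plain PID as the baseline.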
23. Training an Artificial Bat: Modeling Sonar-based Obstacle Avoidance using Deep-reinforcement Learning
- Author
-
Mohan, Adithya Venkatesh
- Subjects
- Artificial Intelligence, Artificial-life, Bat, sonar-based, behavior cloning, Bio-inspired, AI, deep-reinforcement learning
- Abstract
Recent evidence suggests that sonar provides bats with only limited information about their environment. Nevertheless, they can fly swiftly through dense environments while avoiding obstacles. Previously, we proposed a model of sonar-based obstacle avoidance that relied only on the interaural level difference of echo onsets. In this paper, we extend that model. In particular, we present a model that (1) is equipped with a short-term memory of recent echo trains, and (2) uses the full echo train. Because handcrafting a controller to use more sonar data is challenging, we resort to machine learning to train a robotic model. We find that both extensions increase performance and conclude that they could be used to enhance our models of bat sonar behavior. We discuss the implications of our method and findings for both biology and bio-inspired engineering.
- Published
- 2020
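The two extensions the abstract names, full echo trains and a short-term memory of recent trains, can be illustrated with a simple steering rule; the memory depth, gain, and units below are assumptions, and the learned controller in the paper would replace this handcrafted mapping.

```python
from collections import deque

class SonarSteering:
    """Steer away from the side with stronger echoes, smoothed
    over a short-term memory of recent echo trains."""
    def __init__(self, memory=3, gain=0.5):
        self.memory = deque(maxlen=memory)  # recent ILD values
        self.gain = gain

    def update(self, left_echoes, right_echoes):
        # Full echo train: sum energy over all echoes in the train,
        # not just the onset, to form the interaural level difference.
        ild = sum(left_echoes) - sum(right_echoes)
        self.memory.append(ild)
        avg = sum(self.memory) / len(self.memory)
        # Negative command turns away from the louder (closer) side;
        # units (e.g. rad/s) are an assumption of this sketch.
        return -self.gain * avg

ctrl = SonarSteering()
turn = ctrl.update([1.0, 0.5], [0.2])   # obstacle louder on the left
```

The onset-only baseline corresponds to summing just the first element of each train and setting `memory=1`.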
24. Resource offload consolidation based on deep-reinforcement learning approach in cyber-physical systems.
- Abstract
In cyber-physical systems, it is advantageous to combine cloud and edge resources to distribute the workload, processing and computing user data at the point of generation. Services offered by the cloud are not flexible enough to handle variations in the size of the underlying data, which leads to increased latency, deadline violations, and higher cost. On the other hand, resolving these issues with resource-limited edge devices is also challenging. In this work, a novel reinforcement learning algorithm, Capacity-Cost Ratio-Reinforcement Learning (CCR-RL), is proposed that considers both resource utilization and cost for the target cyber-physical system. In CCR-RL, the task offloading decision is made by considering the data arrival rate, edge device computation power, and underlying transmission capacity. A deep learning model is then created to allocate resources based on the underlying communication and computation rates. Moreover, new algorithms are proposed to regulate the allocation of communication and computation resources for the workload among edge devices and edge servers. Simulation results demonstrate that the proposed method achieves lower latency and reduced processing cost compared to state-of-the-art schemes.
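The capacity-cost-ratio idea behind CCR-RL can be sketched as a greedy offloading rule; the learned policy in the paper would replace this rule, and the resource figures and field names below are illustrative assumptions.

```python
def capacity_cost_ratio(resource):
    # Higher is better: more available capacity per unit cost.
    return resource["capacity"] / resource["cost"]

def offload(task_size, edge, cloud):
    """Return 'edge' or 'cloud' for a task of the given size,
    respecting the edge device's limited capacity."""
    edge_feasible = edge["capacity"] >= task_size
    if edge_feasible and capacity_cost_ratio(edge) >= capacity_cost_ratio(cloud):
        return "edge"
    return "cloud"

edge = {"capacity": 20.0, "cost": 1.0}     # limited but cheap
cloud = {"capacity": 100.0, "cost": 10.0}  # elastic but costly
```

Small tasks stay on the cheap edge device; tasks exceeding its capacity fall back to the cloud, mirroring the feasibility-plus-ratio trade-off the abstract describes.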