Multi-Agent Reinforcement Learning Trajectory Design and Two-Stage Resource Management in CoMP UAV VLC Networks.

Authors :: Maleki, Mohammad Reza
Mili, Mohammad Robat
Javan, Mohammad Reza
Mokari, Nader
Jorswieck, Eduard A.
Source :: IEEE Transactions on Communications. Nov2022, Vol. 70 Issue 11, p7464-7476. 13p.
Publication Year :: 2022
Abstract: In this paper, we consider unmanned aerial vehicles (UAVs) equipped with a visible light communication (VLC) access point and coordinated multipoint (CoMP) capability that allows users to connect to more than one UAV. The UAVs can move in three-dimensional (3D) space at a constant acceleration, and a central server is responsible for synchronization and cooperation among them. The effect of accelerated UAV movement must be considered. Unlike most existing works, we examine the effects of variable speed on kinetics and radio resource allocation. For the proposed system model, we define two different time scales: in each frame, the acceleration of each UAV is specified, and in each slot, radio resources are allocated. Our goal is to formulate a multi-objective optimization problem in which the total data rate is maximized and the total communication power consumption is minimized simultaneously. To handle this multi-objective optimization, we first apply the scalarization method and then apply multi-agent deep deterministic policy gradient (MADDPG). We improve this solution method by adding two critic networks together with two-stage resource allocation. [ABSTRACT FROM AUTHOR]
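
Illustrative note (not from the paper): the abstract says the two objectives, maximizing total data rate and minimizing total communication power, are combined by scalarization, but it does not say which form is used. The sketch below assumes a plain weighted sum, one common choice. The function name `scalarize`, the `weight` parameter, the normalization scales, and the candidate values are all hypothetical and chosen only for illustration.

```python
def scalarize(total_rate, total_power, weight, rate_scale=1.0, power_scale=1.0):
    """Weighted-sum scalarization of two objectives (assumed form, not the paper's).

    Returns a single score where larger is better: the normalized rate is
    rewarded with `weight`, the normalized power is penalized with (1 - weight).
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    return weight * total_rate / rate_scale - (1.0 - weight) * total_power / power_scale


# Hypothetical candidate allocations as (total data rate, total power) pairs.
candidates = [(10.0, 4.0), (8.0, 1.0), (12.0, 9.0)]

# A weight near 1 favors data rate; a weight near 0 favors low power.
best_rate_first = max(candidates, key=lambda c: scalarize(c[0], c[1], weight=0.9))
best_power_first = max(candidates, key=lambda c: scalarize(c[0], c[1], weight=0.1))
```

Sweeping `weight` between 0 and 1 moves the chosen allocation from the lowest-power candidate toward the highest-rate one. In a reinforcement learning setting, a scalar score of this kind could serve as the per-step reward, though the reward actually used by the authors is not stated in the abstract.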

Subjects :: *RESOURCE management
*REINFORCEMENT learning
*OPTICAL communications
*RESOURCE allocation
*VISIBLE spectra
