Abstract

Unmanned aerial vehicles (UAVs) are capable of enhancing the coverage of existing cellular networks by acting as aerial base stations (ABSs). Due to the limited on-board battery capacity and dynamic topology of UAV networks, trajectory planning and interference coordination are crucial for providing satisfactory service, especially in emergency scenarios, where it is unrealistic to control all UAVs in a centralized manner by gathering global user information. Hence, we solve the decentralized joint trajectory and transmit power control problem of multi-UAV ABS networks. Our goal is to maximize the number of satisfied users, while minimizing the overall energy consumption of the UAVs. To allow each UAV to adjust its position and transmit power based solely on local, rather than global, observations, a multi-agent reinforcement learning (MARL) framework is conceived. In order to overcome the non-stationarity issue of MARL and to endow the UAVs with distributed decision-making capability, we resort to the centralized-training-with-decentralized-execution paradigm. By judiciously designing the reward, we propose a decentralized joint trajectory and power control (DTPC) algorithm with significantly reduced complexity. Our simulation results show that the proposed DTPC algorithm outperforms state-of-the-art deep reinforcement learning based methods, despite its low complexity.
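The centralized-training-with-decentralized-execution paradigm mentioned above can be illustrated with a minimal sketch. This is not the paper's DTPC algorithm; the agent count, observation size, action set, and linear policies below are all hypothetical placeholders, chosen only to show the key structural idea: a centralized critic sees the joint observation of all UAVs during training, while each actor acts on its own local observation at execution time.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS = 3   # hypothetical number of UAV aerial base stations
OBS_DIM = 4    # illustrative local observation size per UAV
N_ACTIONS = 5  # illustrative discrete actions (e.g., moves / power levels)

# Decentralized actors: one small linear policy per UAV,
# conditioned only on that UAV's local observation.
actors = [rng.normal(size=(OBS_DIM, N_ACTIONS)) * 0.1 for _ in range(N_AGENTS)]

# Centralized critic: scores the JOINT observation of all UAVs;
# this global view is available only during training.
critic_w = rng.normal(size=(N_AGENTS * OBS_DIM,)) * 0.1

def act(agent_id: int, local_obs: np.ndarray) -> int:
    """Decentralized execution: each UAV picks an action from its own
    observation only -- no global information is needed."""
    logits = local_obs @ actors[agent_id]
    return int(np.argmax(logits))

def critic_value(joint_obs: np.ndarray) -> float:
    """Centralized training: the critic evaluates the concatenated
    observations of all UAVs (a global quantity)."""
    return float(joint_obs.reshape(-1) @ critic_w)

# One illustrative step: gather local observations, act in a
# decentralized fashion, and evaluate the joint state centrally.
obs = rng.normal(size=(N_AGENTS, OBS_DIM))
actions = [act(i, obs[i]) for i in range(N_AGENTS)]
value = critic_value(obs)
```

Because the critic is discarded after training, each deployed UAV needs only its own lightweight actor, which is what makes fully decentralized execution possible.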
