Abstract

In next-generation wireless networks, high-mobility unmanned aerial vehicles (UAVs) are promising for providing content coverage, where users can receive sufficient requested content within a given time. However, trajectory planning for multiple UAVs to provide content coverage is challenging because 1) UAVs cannot provide content coverage for all users due to their limited energy and caching storage, and 2) the trajectories of the UAVs are coupled with one another. Moreover, trajectory planning methods based on complete information are inapplicable, since UAVs cannot obtain prior knowledge of the rapidly changing environment. In this paper, we investigate multi-UAV trajectory planning for energy-efficient content coverage. We first formulate an energy efficiency maximization problem that accounts for recharging scheduling and aims to reduce the total length of the UAV trajectories under quality-of-service (QoS) constraints. To handle environmental uncertainty, the trajectory planning problem is modeled as two coupled multi-agent stochastic games, whose equilibria constitute the optimal trajectories. To obtain the equilibria, we propose a decentralized reinforcement learning algorithm that decouples the two games. We prove that the proposed algorithm converges to the optimal solution of the Bellman equation at a faster rate than its centralized counterpart. Moreover, simulation results show that the energy efficiency gap between the proposed algorithm and the optimal solution, which is obtained with prior information on the environment, is below 5%.
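To make the decentralized-learning idea concrete, the following is a minimal sketch of independent per-agent Q-learning in a toy grid world, where each UAV updates its own Bellman backup locally without a shared critic. This is not the paper's algorithm: the environment, reward shaping, and all names (step, coverage_gain, energy_cost, the grid dimensions and hyperparameters) are hypothetical placeholders chosen only to illustrate the decentralized update structure the abstract describes.

import numpy as np

rng = np.random.default_rng(0)

N_UAVS, N_STATES, N_ACTIONS = 3, 25, 5   # assumed 5x5 grid; actions {stay, up, down, left, right}
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1       # illustrative hyperparameters, not from the paper

# One independent Q-table per UAV (decentralized: no shared value function).
Q = [np.zeros((N_STATES, N_ACTIONS)) for _ in range(N_UAVS)]

def step(state, action):
    """Toy environment: random next state and an energy-aware reward.
    A real implementation would model content coverage, caching,
    recharging scheduling, and the coupling between UAV trajectories."""
    next_state = int(rng.integers(N_STATES))
    coverage_gain = rng.random()           # placeholder for served content
    energy_cost = 0.1 * (action != 0)      # moving costs energy in this toy model
    return next_state, coverage_gain - energy_cost

for episode in range(500):
    states = rng.integers(N_STATES, size=N_UAVS)
    for _ in range(50):                    # bounded horizon per episode
        for i in range(N_UAVS):
            s = int(states[i])
            # epsilon-greedy action selection, computed locally by each UAV
            a = int(rng.integers(N_ACTIONS)) if rng.random() < EPS else int(Q[i][s].argmax())
            s_next, r = step(s, a)
            # standard Q-learning (Bellman) update, one table per agent
            Q[i][s, a] += ALPHA * (r + GAMMA * Q[i][s_next].max() - Q[i][s, a])
            states[i] = s_next

The key design point mirrored here is that each agent's update touches only its own Q-table and local observations, which is what allows a decentralized scheme to scale across UAVs; how the paper decouples the two coupled stochastic games is beyond this sketch.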
