Abstract

Unmanned aerial vehicles (UAVs) can provide flexible network coverage in many scenarios, such as emergency communication and network access in areas without terrestrial coverage. However, UAVs have a relatively short communication range and limited energy resources. In extreme conditions such as disasters, communication bandwidth may also be limited, so UAVs cannot exchange large amounts of information with a central server and a decentralized solution is needed. In addition, the interaction between multiple objectives and multiple UAVs leads to a huge state space, which makes large-scale practical application difficult. To simplify these complex interactions, we model the UAV control problem as a mean-field game (MFG). We propose a new UAV control method, mean-field trust region policy optimization (MFTRPO). MFTRPO uses the MFG formulation to construct the Hamilton-Jacobi-Bellman/Fokker-Planck-Kolmogorov (HJB/FPK) equations that characterize the optimal solution, and it addresses the difficulties of practical application through trust region policy optimization and neural network feature embedding. The proposed method: 1) maximizes communication efficiency while ensuring fair communication range and network connectivity; 2) fuses mean-field theory with deep reinforcement learning techniques; and 3) is scalable and adaptive. We conduct extensive simulations for performance evaluation. The simulation results show that MFTRPO significantly and consistently outperforms two commonly used baseline methods in terms of coverage, fairness, and energy consumption.
