Multiple cellular-connected UAVs are increasingly used as inspectors for automatic surveillance and monitoring. In practice, however, the UAVs must respond to service requests from several hotspots, and these requests are typically random in arrival time, data volume, and concurrency. This paper proposes a dynamic dual-UAV response policy for multi-hotspot services based on single-agent deep Q-learning, in which UAVs controlled by a ground base station are dispatched automatically to hotspots and send videos back. First, the task is formulated as an optimization problem whose goal is to maximize the number of successfully served requests, subject to constraints on both UAV energy and request waiting time. Second, a reward function based on service completion is designed to overcome the challenges posed by delayed rewards. Finally, simulations compare the proposed algorithm with the conventional time-priority and distance-priority algorithms. The results show that the proposed algorithm achieves one more response than either baseline under different service densities, with the lowest number of failures and an acceptable average waiting time. The method offers a technical solution to the joint communication-and-control problem of multiple UAVs in complex scenarios.
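To make the two baselines concrete, the following is a minimal sketch of how a time-priority and a distance-priority dispatcher might each choose the next request. It is an illustration only, not the paper's implementation: the `Request` class, its fields, the function names, and the 2-D Euclidean distance model are all assumptions introduced here.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Request:
    """Hypothetical service request; field names are assumptions."""
    hotspot: Tuple[float, float]  # (x, y) position of the requesting hotspot
    arrival_time: float           # time at which the request arrived


def time_priority(requests: List[Request], uav_pos: Tuple[float, float]) -> Request:
    """Serve the pending request that arrived earliest (UAV position is ignored)."""
    return min(requests, key=lambda r: r.arrival_time)


def distance_priority(requests: List[Request], uav_pos: Tuple[float, float]) -> Request:
    """Serve the pending request whose hotspot is nearest to the UAV."""
    return min(requests, key=lambda r: math.dist(uav_pos, r.hotspot))


# Two pending requests: an early one far away and a later one close by.
pending = [
    Request(hotspot=(10.0, 0.0), arrival_time=0.0),
    Request(hotspot=(1.0, 1.0), arrival_time=5.0),
]
uav = (0.0, 0.0)

first_by_time = time_priority(pending, uav)          # the earlier, farther request
first_by_distance = distance_priority(pending, uav)  # the later, nearer request
```

In this toy case the two rules disagree on which request to serve first. A learned policy is meant to handle that kind of trade-off, under the energy and waiting-time constraints described above.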