Abstract

In this paper, we consider an unmanned aerial vehicle (UAV) enabled wireless network in which a set of ground devices is randomly distributed over an area, each with a certain amount of data to transmit. The UAV flies over this region from a starting point to a destination. During its flight, the UAV communicates with the ground devices, and the goal is to maximize the cumulative collected data by optimizing the UAV trajectory subject to a flight time constraint. Because the locations of the ground devices and the communication dynamics are uncertain, an accurate system model is difficult to acquire and maintain. With the help of stochastic modelling, we present a reinforcement learning based automated trajectory optimization algorithm. By dividing the considered region into small grids, which yields a finite state space and action space, we apply Q-learning to maximize the cumulative data collected by the UAV during its flight time. Simulation results demonstrate that the reinforcement learning approach can find an optimal strategy under the flight time constraint.
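
The grid-based Q-learning idea described above can be illustrated with a minimal sketch. The abstract does not specify the state, action, or reward design, so everything below is an assumption made for illustration: the 5x5 grid, the start and destination cells, the 12-step time budget, the device cells and their per-step data amounts, the penalty for missing the destination, and all function names are hypothetical. The state is taken to be (grid cell, elapsed time step), and, to keep the problem Markov in that state, each device is assumed to deliver a fixed amount of data in every step the UAV spends in its cell (no data depletion).

```python
import random

# All constants below are hypothetical values chosen for illustration.
GRID = 5                          # region divided into a 5x5 grid
START, DEST = (0, 0), (4, 4)      # starting point and destination cells
T_MAX = 12                        # flight time budget, in time steps
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0), (0, 0)]  # four moves plus hover
# Data (arbitrary units) collected per step spent in a device cell.
DEVICE_DATA = {(1, 3): 5.0, (3, 1): 3.0, (2, 2): 2.0}
MISS_PENALTY = -50.0              # applied if the UAV is not at DEST at T_MAX


def step(pos, t, action):
    """Apply one action; return (next_pos, next_t, reward, done)."""
    dx, dy = ACTIONS[action]
    # Clip the move so the UAV stays inside the grid.
    x = min(max(pos[0] + dx, 0), GRID - 1)
    y = min(max(pos[1] + dy, 0), GRID - 1)
    nxt, t_next = (x, y), t + 1
    reward = DEVICE_DATA.get(nxt, 0.0)
    done = t_next == T_MAX
    if done and nxt != DEST:
        reward += MISS_PENALTY
    return nxt, t_next, reward, done


def train(episodes=20000, alpha=0.1, gamma=1.0, eps=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {}  # maps (pos, t) -> list of action values
    for _ in range(episodes):
        pos, t, done = START, 0, False
        while not done:
            values = q.setdefault((pos, t), [0.0] * n)
            if rng.random() < eps:
                a = rng.randrange(n)
            else:
                a = max(range(n), key=values.__getitem__)
            nxt, t_next, r, done = step(pos, t, a)
            # Standard Q-learning target: reward plus best next-state value.
            best_next = 0.0 if done else max(q.setdefault((nxt, t_next), [0.0] * n))
            values[a] += alpha * (r + gamma * best_next - values[a])
            pos, t = nxt, t_next
    return q


def greedy_rollout(q):
    """Follow the greedy policy from START; return (path, total reward)."""
    pos, t, done, total, path = START, 0, False, 0.0, [START]
    while not done:
        values = q.get((pos, t), [0.0] * len(ACTIONS))
        a = max(range(len(ACTIONS)), key=values.__getitem__)
        pos, t, r, done = step(pos, t, a)
        total += r
        path.append(pos)
    return path, total
```

Calling `q = train()` followed by `path, total = greedy_rollout(q)` yields a trajectory of `T_MAX + 1` cells and its accumulated reward. Including the elapsed time step in the state is one simple way to let the learned policy account for the flight time constraint; it is not necessarily the formulation used in the paper.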
