Joint Air-to-Ground Scheduling in UAV-Aided Vehicular Communication: A DRL Approach With Partial Observations

Chaowei Wang, Danhao Deng, Weidong Wang

doi:10.1109/lcomm.2022.3167110

Abstract

Unmanned aerial vehicle (UAV)-aided vehicular communication can be greatly facilitated by jointly optimizing the UAV trajectory design and the vehicle assignment. While most existing works assume full observation of the system state, we consider partial observability and predict the vehicles’ trajectories. The vehicle trajectory prediction and the joint optimization problem are modeled as a partially observable Markov decision process (POMDP). To deal with the non-Markovian nature of the POMDP, we construct a new deep recurrent Q-network (DRQN) framework based on the deep Q-network (DQN) algorithm and a long short-term memory (LSTM) layer. Simulation results demonstrate that the proposed DRQN-based scheme converges quickly and outperforms the baseline schemes in terms of the sum spectral efficiency.

Full Text