Abstract
Unmanned aerial vehicle (UAV)-aided vehicular communication can be greatly facilitated by jointly optimizing the UAV trajectory design and the vehicle assignment. While most existing works assume full observation of the system state, we consider partial observability and predict the vehicles’ trajectories. The vehicle trajectory prediction and the joint optimization problem are modeled as a partially observable Markov decision process (POMDP). To deal with the non-Markovian nature of the POMDP, we construct a new deep recurrent Q-network (DRQN) framework based on the deep Q-network (DQN) algorithm and a long short-term memory (LSTM) layer. Simulation results demonstrate that the proposed DRQN-based scheme converges quickly and outperforms the baseline schemes in terms of sum spectral efficiency.
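To illustrate the general idea behind a DRQN, the following minimal sketch shows a forward pass in which an LSTM cell summarizes the history of partial observations and a linear head maps the hidden state to one Q-value per action. It is not the architecture used in this work: the class name `TinyDRQN`, the dimensions, the random untrained weights, and the omission of bias terms and of any training loop are all assumptions made purely for illustration.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class TinyDRQN:
    """Hypothetical toy DRQN: one LSTM cell plus a linear Q-value head.

    Forward pass only; weights are random and biases are omitted.
    """

    def __init__(self, obs_dim, hidden_dim, num_actions, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate (input, forget, output, candidate),
        # each acting on the concatenation [observation, previous hidden state].
        self.W = {g: rand_matrix(hidden_dim, obs_dim + hidden_dim, rng)
                  for g in "ifog"}
        # Linear head mapping the hidden state to one Q-value per action.
        self.W_q = rand_matrix(num_actions, hidden_dim, rng)

    def initial_state(self):
        zeros = [0.0] * self.hidden_dim
        return zeros, list(zeros)

    def step(self, obs, state):
        """Consume one observation; return Q-values and the updated (h, c)."""
        h_prev, c_prev = state
        x = list(obs) + h_prev
        i = [sigmoid(a) for a in matvec(self.W["i"], x)]
        f = [sigmoid(a) for a in matvec(self.W["f"], x)]
        o = [sigmoid(a) for a in matvec(self.W["o"], x)]
        g = [math.tanh(a) for a in matvec(self.W["g"], x)]
        # Standard LSTM update: the cell state carries memory across steps.
        c = [fk * ck + ik * gk for fk, ck, ik, gk in zip(f, c_prev, i, g)]
        h = [ok * math.tanh(ck) for ok, ck in zip(o, c)]
        return matvec(self.W_q, h), (h, c)


def greedy_action(q_values):
    """Index of the action with the largest Q-value."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Feed the same observation twice; the recurrent state differs between steps.
net = TinyDRQN(obs_dim=4, hidden_dim=8, num_actions=3)
state = net.initial_state()
for obs in [[0.1, 0.0, 0.5, 0.2], [0.1, 0.0, 0.5, 0.2]]:
    q_values, state = net.step(obs, state)
    print(greedy_action(q_values), q_values)
```

Because the Q-values depend on the recurrent state as well as the current observation, the same observation can yield different Q-values after different histories, which is what lets a recurrent Q-network cope with partial observability.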