Unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) is a promising technology for providing computational services to ground terminal devices (TDs) in remote areas or during emergency incidents, owing to its flexibility and mobility. This paper aims to maximize the UAV's energy efficiency while accounting for offloading fairness. We formulate this as a mixed-integer nonlinear programming problem that jointly optimizes the UAV flight time, the UAV 3D trajectory, the TDs' binary offloading decisions, and the time allocated to the TDs. The problem is decomposed into two sub-problems, and we propose a PDDQNLP (parameterized dueling deep Q-network and linear programming) algorithm, which combines deep reinforcement learning (DRL) with linear programming (LP), to solve them. For the first sub-problem, a DRL-based algorithm is used to optimize the TD offloading decisions, the UAV trajectory, and the UAV flight time. The action space is hybrid, containing discrete actions (e.g., binary offloading decisions) and continuous actions (e.g., UAV flight time). We therefore parameterize the action space and propose the PDDQN algorithm, which combines the DDPG algorithm for the continuous action space with the Dueling DQN algorithm for the discrete action space. Based on the solution to the first sub-problem, LP is then used in the second sub-problem to adjust the time allocated to the TDs. Numerical results show that the PDDQNLP algorithm outperforms its counterparts.
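To make the parameterized hybrid action space concrete, the following is a minimal PyTorch sketch of the kind of architecture the abstract describes: a DDPG-style actor maps the state to continuous action parameters (e.g., UAV flight time), and a dueling Q-network scores each discrete action (e.g., binary offloading) given the state and those parameters. The class names, layer sizes, and dimensions are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class ParamActor(nn.Module):
    """DDPG-style actor: state -> continuous action parameters in [0, 1]."""
    def __init__(self, state_dim: int, param_dim: int, hidden: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, param_dim), nn.Sigmoid(),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state)

class DuelingQ(nn.Module):
    """Dueling DQN head over discrete actions, conditioned on (state, params)."""
    def __init__(self, state_dim: int, param_dim: int, n_discrete: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim + param_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)               # state-value stream V
        self.advantage = nn.Linear(hidden, n_discrete)  # advantage stream A per discrete action

    def forward(self, state: torch.Tensor, params: torch.Tensor) -> torch.Tensor:
        h = self.feature(torch.cat([state, params], dim=-1))
        v, a = self.value(h), self.advantage(h)
        # Dueling aggregation: Q = V + (A - mean(A))
        return v + a - a.mean(dim=-1, keepdim=True)

# Illustrative usage (hypothetical sizes): pick the greedy discrete action
# together with the continuous parameters produced by the actor.
state_dim, param_dim, n_discrete = 10, 1, 4
actor, critic = ParamActor(state_dim, param_dim), DuelingQ(state_dim, param_dim, n_discrete)
state = torch.randn(1, state_dim)
with torch.no_grad():
    params = actor(state)                # continuous part, e.g., flight time fraction
    q_values = critic(state, params)     # one Q-value per discrete action
    discrete_action = q_values.argmax(dim=-1)
```

In this sketch the actor would be trained to maximize the critic's Q-value (as in DDPG) while the critic is trained with a temporal-difference loss over the discrete actions (as in Dueling DQN); the exact training loop and the LP-based time-allocation step are described in the full paper.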