Abstract

Computation offloading via device-to-device communications can improve the performance of mobile edge computing by exploiting the computing resources of user devices. However, most proposed optimization-based computation offloading schemes cannot adapt to dynamic environments, because they must contend with time-varying wireless conditions, mixed continuous-discrete actions, and the need for coordination among devices. Conventional reinforcement learning approaches are likewise ineffective for optimal sequential decision problems with mixed continuous-discrete actions. In this paper, we propose a hierarchical deep reinforcement learning (HDRL) framework to solve the joint computation offloading and resource allocation problem. The proposed HDRL framework has a hierarchical actor-critic architecture consisting of a meta critic, multiple basic critics, and multiple actors. Specifically, a deep Q-network (DQN) is combined with deep deterministic policy gradient (DDPG) to cope with the mixed continuous-discrete action space. Furthermore, to handle the coordination among devices, the meta critic acts as a DQN that outputs the joint discrete action of all devices, and each basic critic acts as the critic part of DDPG, evaluating the output of its corresponding actor. Simulation results show that the proposed HDRL algorithm can significantly reduce task computation latency compared with baseline offloading schemes.
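
The action-selection flow described above can be illustrated with a minimal sketch. It is not the paper's implementation: random linear layers stand in for trained networks, and the number of devices, the number of discrete choices per device, the state size, and all function names (`select_joint_discrete`, `actor`, `basic_critic`, `act`) are hypothetical. It also assumes that each actor conditions on its device's discrete choice, which the abstract does not specify.

```python
import itertools
import numpy as np

rng = np.random.default_rng(0)

N_DEVICES = 3   # hypothetical number of devices
N_CHOICES = 2   # hypothetical discrete options per device (e.g. local vs. offload)
STATE_DIM = 4   # hypothetical size of the shared state vector

# Enumerate every joint discrete action: one choice per device.
JOINT_ACTIONS = list(itertools.product(range(N_CHOICES), repeat=N_DEVICES))

# Random linear layers stand in for trained networks (assumption).
W_meta = rng.normal(size=(STATE_DIM, len(JOINT_ACTIONS)))
W_actor = rng.normal(size=(N_DEVICES, STATE_DIM + 1))
W_critic = rng.normal(size=(N_DEVICES, STATE_DIM + 2))


def select_joint_discrete(state, epsilon=0.0):
    """Meta critic used as a DQN: epsilon-greedy over joint-action Q-values."""
    if rng.random() < epsilon:
        return JOINT_ACTIONS[rng.integers(len(JOINT_ACTIONS))]
    q_values = state @ W_meta  # one Q-value per joint discrete action
    return JOINT_ACTIONS[int(np.argmax(q_values))]


def actor(i, state, discrete):
    """DDPG-style deterministic actor: a continuous action in (0, 1)."""
    x = np.append(state, discrete)
    return 1.0 / (1.0 + np.exp(-(W_actor[i] @ x)))  # sigmoid squashing


def basic_critic(i, state, discrete, continuous):
    """Basic critic: scores the continuous action of actor i."""
    x = np.append(state, [discrete, continuous])
    return float(W_critic[i] @ x)


def act(state, epsilon=0.0):
    """One decision step: joint discrete action, then per-device continuous actions."""
    joint = select_joint_discrete(state, epsilon)
    continuous = [actor(i, state, d) for i, d in enumerate(joint)]
    values = [
        basic_critic(i, state, d, c)
        for i, (d, c) in enumerate(zip(joint, continuous))
    ]
    return joint, continuous, values


state = np.ones(STATE_DIM)
joint, continuous, values = act(state)
```

In this sketch the joint discrete action space grows as `N_CHOICES ** N_DEVICES`, so enumerating it is only practical for small settings; how the paper handles larger ones is not stated in the abstract.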
