Mobile edge computing (MEC) provides an economical way for resource-constrained edge users to offload computational workload to MEC servers co-located with the access point (AP). In this article, we consider a hybrid computation offloading scheme that allows edge users to offload workloads via both active RF communications and backscatter communications. We aim to maximize the overall energy efficiency, subject to the completion of all users' workloads, by jointly optimizing the AP's beamforming and the users' offloading decisions. Considering a dynamic environment, we propose a hierarchical multi-agent deep reinforcement learning (H-MADRL) framework to solve this problem. The high-level agent resides in the AP and optimizes the beamforming strategy, while the low-level user agents learn and adapt their individual offloading strategies. To further improve the learning efficiency, we propose a novel optimization-driven learning algorithm that allows the AP to estimate the low-level users' actions by efficiently solving an approximate optimization problem. The action estimates are then shared with all users, driving each user to update its own action independently. Simulation results reveal that our algorithm can improve the system performance by 50%. The learning efficiency and reliability are also improved significantly compared to model-free learning methods.
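To make the hierarchical interaction concrete, below is a minimal sketch of the training loop structure the abstract describes: the high-level AP agent picks a beamforming action, the AP derives action estimates for the low-level users by solving a simplified proxy problem, and each user refines its own offloading action around the shared estimate. All names (`estimate_user_actions`, the channel and energy models, the placeholder policies) are hypothetical illustrations, not the paper's actual algorithm or parameters.

```python
import numpy as np

# --- Hypothetical parameters (not from the paper) ---
NUM_USERS = 3   # low-level user agents
BEAM_DIM = 4    # AP beamforming vector dimension
EPISODES = 100
STEPS = 50

rng = np.random.default_rng(0)

def ap_beamforming_policy(state):
    """High-level agent: pick a unit-norm beamforming vector.
    A random placeholder standing in for a learned policy."""
    w = rng.standard_normal(BEAM_DIM)
    return w / np.linalg.norm(w)

def estimate_user_actions(beam, state):
    """Optimization-driven estimation: the AP solves a simplified
    proxy problem to guess each user's offloading fraction.
    Here: fraction proportional to effective channel gain."""
    gains = np.abs(state["channels"] @ beam)
    return gains / (gains.max() + 1e-9)  # one value in [0, 1] per user

def user_update(est_action, noise=0.05):
    """Low-level agent: start from the shared estimate and explore
    locally around it, instead of searching the action space blind."""
    return np.clip(est_action + noise * rng.standard_normal(), 0.0, 1.0)

def env_step(state, beam, offloads):
    """Toy environment: reward is an energy-efficiency proxy
    (offloaded throughput per unit of energy consumed)."""
    gains = np.abs(state["channels"] @ beam)
    throughput = np.sum(offloads * np.log2(1.0 + gains))
    energy = 1.0 + np.sum(offloads)  # crude energy model
    next_state = {"channels": state["channels"]
                  + 0.01 * rng.standard_normal((NUM_USERS, BEAM_DIM))}
    return next_state, throughput / energy

state = {"channels": rng.standard_normal((NUM_USERS, BEAM_DIM))}
for ep in range(EPISODES):
    for _ in range(STEPS):
        beam = ap_beamforming_policy(state)                 # high-level action
        est = estimate_user_actions(beam, state)            # AP's shared estimate
        offloads = np.array([user_update(e) for e in est])  # low-level actions
        state, reward = env_step(state, beam, offloads)
    # (A real H-MADRL implementation would update the high- and
    #  low-level actor/critic networks here using the rewards.)
```

The key design point the sketch illustrates is that the shared estimate narrows each user's exploration to a neighborhood of a near-optimal action, which is why the paper reports faster and more reliable learning than fully model-free methods.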