Abstract

As an effective tool, reinforcement learning (RL) has attracted much attention in the field of mobile edge computing (MEC). For MEC task offloading, the goal is to find a high-quality offloading strategy quickly. The Policy Gradient (PG) algorithm, one of the RL algorithms, is known for its fast convergence; it does not need to model state transitions and can be run directly to obtain a result. In a queuing task scenario with a single terminal and a single edge server, the PG algorithm can quickly obtain a high-quality offloading scheme. The Greedy algorithm is also a commonly used decision-making method for MEC task offloading, so we use it as the experimental control group and compare both against the exhaustive algorithm. Simulation results show that, starting from randomly chosen initial values, the PG algorithm saves more than 50% of the overhead. Although the Greedy algorithm has an advantage when the number of tasks is small, its overhead grows as the number of tasks increases because of its long decision time. Therefore, the PG algorithm is more effective in our scenario, obtaining a high-quality offloading scheme in a shorter decision time.
