ABSTRACT
In multi‐access edge computing (MEC), computational task offloading from mobile terminals (MTs) is expected to support green applications under constraints on energy consumption and service latency. Nevertheless, the diverse statuses of edge servers and mobile terminals, along with fluctuating offloading routes, make computational task offloading challenging. To support green applications, we first present a novel computational task offloading model. Specifically, the model is constrained by energy consumption and service latency and accounts for the following: (1) smart mobile terminals with computational capabilities could serve as carriers; (2) the diverse computational and communication capacities of edge servers have the potential to enhance the offloading process; (3) the unpredictable routing paths of mobile terminals and edge servers could result in varied information transmissions. We then propose an improved deep reinforcement learning (DRL) algorithm named PS‐DDPG, which builds on deep deterministic policy gradients (DDPG) and incorporates the prioritized experience replay (PER) and stochastic weight averaging (SWA) mechanisms, to find an efficient offloading strategy and thereby reduce energy consumption. The algorithm is proposed for each MT, which is responsible for making decisions regarding task partition, channel allocation, and transmission power control. Our approach produces the final estimate of the observed values and updates memory via write operations. The replay buffer holds data from previous time slots to update both the actor and critic networks, after which the buffer is reset. Comprehensive experiments validate the superior performance of our algorithm, including its stability and convergence, compared with prior studies.
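As a rough illustration of the two mechanisms named above, the following minimal sketch shows proportional prioritized sampling and a running weight average in plain Python. It is not the PS‐DDPG implementation: the function names, the exponent `alpha`, the offset `eps`, and the use of absolute TD errors as priorities are all assumptions made for illustration, following the commonly used formulations of PER and SWA.

```python
import random


def per_probabilities(td_errors, alpha=0.6, eps=1e-6):
    """Proportional prioritization (assumed form):
    P(i) = p_i**alpha / sum_k p_k**alpha, with p_i = |td_error_i| + eps."""
    priorities = [(abs(d) + eps) ** alpha for d in td_errors]
    total = sum(priorities)
    return [p / total for p in priorities]


def per_sample(buffer, td_errors, batch_size, alpha=0.6, rng=random):
    """Draw a batch (with replacement) so that transitions with larger
    TD error are replayed more often."""
    probs = per_probabilities(td_errors, alpha)
    return rng.choices(buffer, weights=probs, k=batch_size)


def swa_update(avg_weights, new_weights, n_averaged):
    """Running average of network weights (assumed form):
    w_swa <- (w_swa * n + w) / (n + 1)."""
    return [
        (a * n_averaged + w) / (n_averaged + 1)
        for a, w in zip(avg_weights, new_weights)
    ]
```

Here `buffer` stands for any list of stored transitions and the weight vectors are flat lists of floats; in an actor-critic setting the same average would be applied to each network's parameters.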