Mobile edge computing (MEC) has recently emerged as an effective paradigm for the computation-intensive and delay-critical applications supported by Internet of Things (IoT) devices. However, the computational resources at MEC servers are typically far more limited than those of the remote cloud server (CS). To serve the ever-increasing number of IoT devices and applications, edge and cloud computing should be exploited collaboratively. Further, with finite blocklength (FBL) coding adopted in URLLC-supported networks, the decoding error probability cannot be ignored. On these grounds, this paper investigates the dynamic offloading of FBL packets in an edge-cloud collaborative MEC system consisting of multiple mobile IoT devices (MIDs) with energy harvesting (EH), multiple edge servers, and one CS in a dynamic environment. The optimization problem is formulated to minimize the average long-term service cost, defined as the weighted sum of MID energy consumption and service delay (comprising the uploading transmission delay, the handover cost, and the execution delay of the offloaded part, together with the local execution delay of the locally processed part), subject to constraints on the available resources, energy causality, the allowable service delay, and the maximum decoding error probability. To address this problem, which involves both discrete and continuous variables, we propose a multi-device hybrid decision-based deep reinforcement learning (DRL) solution, named the DDPG-D3QN algorithm, in which the deep deterministic policy gradient (DDPG) and dueling double deep Q networks (D3QN) are invoked to tackle the continuous and discrete action domains, respectively. Specifically, we improve the actor-critic structure of DDPG by combining it with D3QN: the actor part of DDPG searches for the optimal offloading rate and power control of local execution, while the critic part of DDPG is combined with D3QN to select the optimal server for offloading.
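The two key ideas above, a weighted-sum service cost and a hybrid continuous/discrete action, can be illustrated with a minimal sketch. All function names, the scalar weight, and the argument layout are illustrative assumptions, not the paper's exact formulation:

```python
def service_cost(energy_j, upload_delay_s, handover_delay_s,
                 offload_exec_delay_s, local_exec_delay_s, weight=0.5):
    """Weighted sum of MID energy consumption and service delay.

    The delay term combines the uploading transmission delay, the
    handover cost, and the execution delay of the offloaded part with
    the local execution delay of the locally processed part. The
    weight and units here are placeholders for the paper's trade-off
    parameter.
    """
    total_delay = (upload_delay_s + handover_delay_s +
                   offload_exec_delay_s + local_exec_delay_s)
    return weight * energy_j + (1.0 - weight) * total_delay


def hybrid_action(actor_continuous, q_values_per_server):
    """Hybrid decision: a DDPG-style actor outputs the continuous part
    (offloading rate, local execution power), while a D3QN-style head
    picks the discrete server with the largest Q-value."""
    offload_rate, local_power = actor_continuous
    server = max(range(len(q_values_per_server)),
                 key=lambda i: q_values_per_server[i])
    return offload_rate, local_power, server
```

For example, with a 0.5 weight, an energy term of 2.0 J and delays summing to 0.5 s yield a cost of 1.25, and Q-values `[1.0, 2.5, 0.7]` select server index 1.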
Simulation results demonstrate that the proposed DDPG-D3QN algorithm offers better stability and faster convergence while achieving higher rewards than existing DRL-based methods. Furthermore, edge-cloud collaboration is shown to outperform offloading schemes with no collaboration between edge and cloud.