Abstract

Deep neural networks (DNNs) have achieved great success in various applications. Traditionally, DNNs are deployed entirely in the cloud, which may incur a prohibitive delay in transferring input data from end devices to the cloud. To address this problem, hybrid computing environments, consisting of the cloud, edge, and end devices, are adopted to offload DNN layers, placing the larger layers (with more data) in the cloud and the smaller layers (with less data) at the edge and on end devices. A key issue in hybrid computing environments is how to minimize the system cost while completing the offloaded layers within their deadline constraints. In this article, a self-adaptive discrete particle swarm optimization (PSO) algorithm using genetic algorithm (GA) operators is proposed to reduce the system cost caused by data transmission and layer execution. The approach accounts for the characteristics of DNN partitioning and layer offloading over the cloud, edge, and end devices. The mutation and crossover operators of GA are adopted to avert the premature convergence of PSO, and the enhanced population diversity of PSO distinctly reduces the system cost. The proposed offloading strategy is compared with benchmark solutions, and the results show that it effectively reduces the system cost of offloading DNN-based applications over the cloud, edge, and end devices relative to the benchmarks.

Highlights

  • Deep neural networks (DNNs) have achieved great success in various applications, such as natural language processing, speech recognition, and computer vision [1]

  • The proposed off-loading strategy is compared with benchmark solutions, and the results show that our strategy can effectively reduce the system cost of off-loading for DNN-based applications over the cloud, edge and end devices relative to the benchmarks

  • We propose a self-adaptive particle swarm optimization (PSO) algorithm using genetic algorithm (GA) operators (PSO-GA) to reduce the system cost caused by data transmission and layer execution while meeting the deadline constraints of all DNN-based applications

Introduction

Deep neural networks (DNNs) have achieved great success in various applications, such as natural language processing, speech recognition, and computer vision [1]. Meanwhile, the number of Internet of Things (IoT) devices has increased dramatically. These end devices, equipped with sensors (e.g., microphones, cameras, and gyroscopes) that capture large amounts of environmental data, are attractive platforms for machine learning (ML) applications [2]. However, IoT devices with limited energy and computing resources cannot afford computation-intensive tasks such as DNNs. DNNs are conventionally deployed in the cloud with its powerful computation capability, but this results in a prohibitively long delay when offloading input data from sensors to DNNs in the cloud, due to the long distance between the cloud and IoT devices.
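The offloading problem and the PSO-GA idea described above can be sketched in code. The following is a minimal illustration, not the article's implementation: the per-layer execution costs, transfer costs, and deadline are invented placeholder numbers, and a placement vector assigns each DNN layer to a tier (0 = end device, 1 = edge, 2 = cloud). GA crossover and mutation stand in for the PSO velocity update, pulling each particle toward its personal best and the global best while mutation preserves population diversity.

```python
import random

# Hypothetical per-layer execution cost on (end device, edge, cloud);
# all numbers are illustrative placeholders, not from the article.
EXEC = [(9, 4, 1), (8, 3, 1), (7, 3, 1), (6, 2, 1)]
XFER = [5, 4, 3, 2]   # cost of moving a layer's output one tier up/down
IN_XFER = 12          # cost of moving raw sensor input one tier up
DEADLINE = 30         # assumed overall cost budget (deadline constraint)

def cost(placement):
    """Execution cost + inter-tier transfer cost, with a large penalty
    when the deadline constraint is violated."""
    c = sum(EXEC[i][t] for i, t in enumerate(placement))
    c += IN_XFER * placement[0]          # input starts at the end device
    c += sum(XFER[i] * abs(placement[i] - placement[i + 1])
             for i in range(len(placement) - 1))
    return c + (1000 if c > DEADLINE else 0)

def crossover(a, b):
    """One-point crossover: pull a particle toward a guide solution."""
    p = random.randrange(1, len(a))
    return a[:p] + b[p:]

def mutate(p, rate=0.1):
    """Randomly reassign a layer's tier to keep population diversity."""
    return [random.randrange(3) if random.random() < rate else t for t in p]

def pso_ga(n_layers=4, swarm=20, iters=100, seed=0):
    random.seed(seed)
    particles = [[random.randrange(3) for _ in range(n_layers)]
                 for _ in range(swarm)]
    pbest = [list(x) for x in particles]
    for _ in range(iters):
        gbest = min(pbest, key=cost)
        for i, x in enumerate(particles):
            # GA operators replace the continuous PSO velocity update:
            # crossover with the personal best, then the global best,
            # then mutation to avert premature convergence.
            x = mutate(crossover(crossover(x, pbest[i]), gbest))
            particles[i] = x
            if cost(x) < cost(pbest[i]):
                pbest[i] = x
    gbest = min(pbest, key=cost)
    return gbest, cost(gbest)
```

Under these toy costs, placements that run early (large) layers near the data and late (small) layers in the cloud tend to win, which mirrors the partitioning intuition in the abstract.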
