Abstract

In recent years, a great number of ubiquitous Internet-of-Things (IoT) devices have been connected to the Internet. With the massive amount of IoT data, cloud-based intelligent applications have sprung up to support accurate monitoring and decision-making. In practice, however, the intrinsic transport bottleneck of the Internet severely handicaps the real-time performance of cloud-based intelligence that depends on IoT data. In the past few years, researchers have turned to the computing paradigm of edge-cloud collaboration, offloading computing tasks from the cloud to the edge environment in order to avoid transmitting large volumes of data through the Internet to the cloud. To date, it remains an open issue to effectively allocate deep learning tasks (i.e., deep neural network computation) over an edge-cloud system so as to shorten application response time. In this paper, we propose the latency-minimum allocation (LMA) problem, which aims to allocate the layers of a deep neural network (DNN) over the edge-cloud environment such that the total latency of processing the DNN is minimized. First, we formalize the LMA problem in its general form, prove its NP-hardness, and present an insightful characteristic of feasible DNN layer allocations. Second, we design an approximate algorithm, called CoEdge, which handles the LMA problem in polynomial time. By exploiting the communication and computation resources of the edge, CoEdge greedily selects beneficial edge nodes and allocates the DNN layers to the selected nodes using a recursion-based policy. Finally, we conduct extensive simulation experiments with realistic setups; the results show the efficacy of CoEdge in reducing deep learning latency compared to two state-of-the-art schemes.
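To give a rough feel for the kind of allocation decision the LMA problem involves, the following is a minimal, hypothetical sketch of a per-layer greedy heuristic: each DNN layer is placed on whichever edge node (or the cloud) yields the smallest estimated compute-plus-transfer latency. The names (`Node`, `Layer`, `greedy_allocate`) and the latency model are illustrative assumptions, not the paper's; in particular, CoEdge's recursion-based policy and the coupling between consecutive layer placements (the reason the problem is NP-hard) are not reproduced here.

```python
# Simplified, hypothetical sketch of greedy DNN-layer allocation over an
# edge-cloud setting. This is NOT the CoEdge algorithm from the paper; it
# ignores the dependence of transfer cost on where the *next* layer runs.
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    flops_per_sec: float   # compute capability of the node
    uplink_bps: float      # assumed bandwidth toward the next hop

@dataclass
class Layer:
    flops: float           # compute cost of the layer
    out_bytes: float       # size of the layer's output feature map

def layer_latency(layer: Layer, node: Node) -> float:
    """Estimated compute time on `node` plus time to ship the layer's output."""
    return layer.flops / node.flops_per_sec + 8 * layer.out_bytes / node.uplink_bps

def greedy_allocate(layers: list[Layer], edge_nodes: list[Node], cloud: Node):
    """Assign each layer to the candidate node with the lowest per-layer
    latency estimate; return the placement plan and its total latency."""
    plan, total = [], 0.0
    for layer in layers:
        best = min(edge_nodes + [cloud], key=lambda n: layer_latency(layer, n))
        plan.append(best.name)
        total += layer_latency(layer, best)
    return plan, total
```

In the actual LMA formulation, the latency of a placement depends jointly on where consecutive layers sit (data must be forwarded between hosts), which is why a simple per-layer choice like the sketch above is only a heuristic and the paper resorts to a recursion-based allocation policy.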

Highlights

  • In the past few years, the popularity of emerging Internet-of-Things (IoT) applications has been generating a huge amount of real-time data

  • Cloud computing has played an indispensable role in executing large-scale and computation-intensive tasks, such as deep learning, and the intelligence of IoT applications usually resides in the cloud [1], [2]

  • As a matter of fact, the nonnegligible transmission delay of the Internet has become the essential obstacle to expediting the deep learning-based IoT applications [7]–[9]

Summary

Introduction

In the past few years, the popularity of emerging Internet-of-Things (IoT) applications has been generating a huge amount of real-time data. Based on the IoT data, IoT end-users can perform accurate monitoring and make effective decisions with their intelligent utilities deployed in the cloud. For such cloud-based intelligent IoT applications, the data generated by ubiquitous IoT devices has to be delivered, through the Internet, to the cloud for further processing. Cloud computing has played an indispensable role in executing large-scale and computation-intensive tasks, such as deep learning, and the intelligence of IoT applications usually resides in the cloud [1], [2]. Under such a cloud-centric paradigm, the data delivered from IoT devices must traverse the Internet before it can be processed. As a matter of fact, the nonnegligible transmission delay of the Internet has become the essential obstacle to expediting deep learning-based IoT applications [7]–[9].
