Abstract

New applications such as intelligent manufacturing, autonomous vehicles and smart cities are driving the deployment of large-scale deep learning models in Internet of Things (IoT) edge environments. However, deep learning models require substantial computation, storage and communication resources to run, so it is generally difficult to deploy and execute a complete deep neural network (DNN) on a resource-constrained edge device. One possible solution is to slice the DNN into multiple tiles distributed across different edge devices, which reduces both the computation and the data volume on each device. In this paper, we propose DISSEC, a distributed scheduling strategy for DNN inference on IoT edge clusters. DISSEC leverages spatial partitioning: it fuses consecutive convolutional layers and divides them into multiple partitions that can be executed independently, and it introduces a method for expressing the dependencies between partitions. It further proposes a heuristic search algorithm that produces a distributed parallel strategy with the best overall inference latency. The evaluation shows that our strategy fully utilizes edge device resources by having multiple edge devices execute partitioned tasks in parallel. Furthermore, compared to an existing scheduling strategy, our strategy reduces communication overhead by 20% and overall execution latency by 9% across different partitioning granularities and numbers of edge devices.
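The core idea of spatial partitioning can be illustrated with a small sketch. The code below (an illustrative example, not the paper's implementation) splits the output of a 2D convolution row-wise into independent tiles; each tile's dependency is exactly the input row range it needs, including the (kh-1)-row halo that overlapping receptive fields require, so tiles can be assigned to different edge devices and executed in parallel.

```python
import numpy as np

def conv2d_valid(x, k):
    """Naive 'valid' 2D convolution (cross-correlation), used as a reference."""
    kh, kw = k.shape
    oh, ow = x.shape[0] - kh + 1, x.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(x[i:i + kh, j:j + kw] * k)
    return out

def partition_rows(x, k, num_parts):
    """Split the convolution output row-wise into num_parts tiles.

    Returns, for each tile, the input row range it depends on (including
    the (kh-1)-row halo) and the tile's computed output. Tiles share no
    intermediate state, so they can run on separate devices."""
    kh = k.shape[0]
    oh = x.shape[0] - kh + 1
    bounds = np.linspace(0, oh, num_parts + 1, dtype=int)
    tiles = []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        in_lo, in_hi = lo, hi + kh - 1  # input rows this tile needs
        tiles.append(((in_lo, in_hi), conv2d_valid(x[in_lo:in_hi], k)))
    return tiles
```

Stacking the tile outputs in order reproduces the full convolution, which is what makes the partitions safe to distribute; the overlap between adjacent input ranges is the communication cost a scheduler such as DISSEC would try to minimize.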
