Abstract

The emergence of the Internet of Things (IoT) has led to a remarkable increase in the volume of data generated at the network edge. To support real-time smart IoT applications, massive amounts of data generated by edge devices must be processed with low latency using methods such as deep neural networks (DNNs). To improve application performance and minimize resource cost, enterprises have begun to adopt edge computing, a computation paradigm that advocates processing input data locally at the network edge. However, because edge nodes are often resource-constrained, running data-intensive DNN inference tasks on an individual edge node often incurs high latency, which seriously limits the practicality and effectiveness of this model. In this paper, we study the distributed execution of inference tasks on edge clusters for Convolutional Neural Networks (CNNs), one of the most prominent classes of DNNs. Unlike previous work, we present Fully Decomposable Spatial Partition (FDSP), which naturally supports resource heterogeneity and dynamicity in edge computing environments. We then present a compression technique that further reduces network communication overhead. Our system, called ADCNN, provides up to a 2.8× speedup over state-of-the-art approaches while achieving competitive inference accuracy.
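
To give a sense of the core idea, the sketch below illustrates spatial partitioning of a single convolutional layer: the input feature map is split into horizontal bands, each extended by a halo of neighboring rows so that every band can be convolved independently and the stitched outputs match the monolithic result. The helper names, the use of SciPy, and the band-based split are our illustration of the general technique, not the paper's FDSP implementation, which handles full CNN stacks across heterogeneous edge nodes.

```python
# Minimal sketch of spatial partitioning for one conv layer (illustrative only;
# not the ADCNN/FDSP implementation). Assumes a stride-1 conv with zero padding.
import numpy as np
from scipy.signal import correlate2d  # 2-D cross-correlation: a conv layer's forward pass

def partition_rows(image, num_tiles, halo):
    """Split an H x W feature map into horizontal bands, each extended by
    `halo` rows/columns of context so every band can be convolved independently."""
    H = image.shape[0]
    padded = np.pad(image, halo)  # zero-pad all borders, as a padded conv would
    bounds = np.linspace(0, H, num_tiles + 1, dtype=int)
    # The band producing output rows [lo, hi) needs padded rows [lo, hi + 2*halo).
    return [padded[lo : hi + 2 * halo] for lo, hi in zip(bounds, bounds[1:])]

kernel = np.random.randn(3, 3).astype(np.float32)
image = np.random.randn(32, 32).astype(np.float32)
halo = kernel.shape[0] // 2  # one row/column of context for a 3x3 kernel

# In a distributed setting each band would be dispatched to a different edge
# node; here we convolve them locally and verify that the stitched result
# matches the monolithic convolution.
bands = partition_rows(image, num_tiles=4, halo=halo)
stitched = np.concatenate(
    [correlate2d(b, kernel, mode="valid") for b in bands], axis=0)
reference = correlate2d(np.pad(image, halo), kernel, mode="valid")
assert np.allclose(stitched, reference, atol=1e-5)
```

Because each band is self-contained once the halo is attached, a scheduler can size the bands to each node's capacity, which is what makes this style of partition a natural fit for heterogeneous and dynamic edge clusters.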
