Abstract

With the advance of deep neural networks (DNNs), artificial intelligence (AI) has found its way into many applications in our daily lives. DNN-based models can be stored on portable storage disks or low-power Neural Compute Sticks and then deployed to edge devices through the USB interface for AI-based applications, such as automatic diagnosis systems or smart surveillance systems, providing a practical way to incorporate AI into the Internet of Things (IoT). In this work, based on our observation and careful analysis, we propose a model-based deep encoding (MDE) method built upon Huffman coding to compress a DNN model transmitted through the USB interface to edge devices. Using the proposed lopsidedness estimation approach, we exploit a modified Huffman coding method to increase the USB transmission efficiency of quantized DNN models while reducing the computational cost entailed by the coding process. Experiments on several benchmark DNN models compressed with three emerging quantization techniques indicate that our method achieves a high compression ratio of 88.72%, with 93.76% of the stuffing bits saved on average.
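To make the two ingredients named above concrete, the sketch below illustrates Huffman coding of quantized weight values and the USB bit-stuffing overhead the paper targets (USB inserts a 0 after every six consecutive 1s on the wire). This is a minimal illustration under our own assumptions, not the authors' MDE implementation; the helper names `build_huffman` and `count_stuffing_bits` and the toy weight distribution are hypothetical.

```python
# Hypothetical sketch, not the authors' MDE pipeline: Huffman-code 8-bit
# quantized weights and count the USB stuffing bits the coded stream incurs.
import heapq
from collections import Counter

def build_huffman(symbols):
    """Build a Huffman code table {symbol: bitstring} from symbol frequencies."""
    freq = Counter(symbols)
    # Heap entries: (frequency, tiebreak, tree); tree is a symbol or a pair.
    heap = [(f, i, s) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    if len(heap) == 1:  # degenerate case: only one distinct symbol
        return {heap[0][2]: "0"}
    tiebreak = len(heap)
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, tiebreak, (t1, t2)))
        tiebreak += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix
    walk(heap[0][2], "")
    return codes

def count_stuffing_bits(bits):
    """USB bit stuffing inserts a 0 after six consecutive 1s; count insertions."""
    stuffed, run = 0, 0
    for b in bits:
        run = run + 1 if b == "1" else 0
        if run == 6:
            stuffed += 1
            run = 0  # the inserted 0 breaks the run
    return stuffed

# Toy example: a lopsided distribution of 8-bit quantized weights, as is
# typical after aggressive quantization.
weights = [0] * 900 + [1] * 60 + [255] * 30 + [128] * 10
codes = build_huffman(weights)
stream = "".join(codes[w] for w in weights)
raw_bits = 8 * len(weights)
print(f"bits saved by coding: {1 - len(stream) / raw_bits:.2%}")
print(f"stuffing bits in coded stream: {count_stuffing_bits(stream)}")
```

The more lopsided the quantized-weight histogram, the shorter the Huffman codes for the dominant symbols; a codebook that also avoids long runs of 1s is what reduces the stuffing-bit overhead on the USB wire.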

Highlights

  • In recent years, the demand for internet-connected devices, or the Internet of Things (IoT), has grown drastically

  • We propose a Model-based Deep Encoding (MDE) method to optimize the transmission efficiency of deep neural network (DNN) models deployed via the USB interface to edge devices

  • The MDE pipeline builds upon a modified Huffman coding scheme guided by lopsidedness estimation, achieving a compression ratio of 88.72% with 93.76% of the stuffing bits saved on average


Introduction

The demand for internet-connected devices, or the Internet of Things (IoT), has grown drastically. With the advance of deep-learning technology in Computer Vision (CV) and Natural Language Processing (NLP), migrating deep-learning-based CV and NLP applications to IoT devices is attracting increasing attention [1]. Such applications usually require massive data transmission between IoT devices and the cloud, since they mostly rely on cloud computing to run deep-learning models. To save on cloud computation costs and preserve data privacy, using edge computing to analyze or process data with deep neural network models has become pervasive. Edge computing reduces the amount of data transmitted to the cloud and thus avoids long latency [2].
