Abstract
Deep Neural Networks (DNNs) are widely applied in mobile applications that demand real-time operation yet require large memory space. This presents a new challenge for the low-power, efficient implementation of diverse applications, such as speech recognition and image classification, on embedded edge devices. This work presents a hardware-based DNN compression approach that addresses the limited memory resources of edge devices. We propose a new entropy-based compression algorithm for encoding DNN weights, together with a real-time decoding method and an efficient dedicated hardware implementation. The proposed approach significantly reduces the required DNN weight memory (by approximately 70% for AlexNet and 63% for VGG19) while decoding one weight per clock cycle. Results show a high compression ratio compared with well-known lossless compression algorithms. The proposed hardware decoder enables efficient implementation of large DNNs on low-power edge devices with limited memory resources.
Highlights
Deep Neural Networks (DNNs) have become a powerful tool for many Artificial Intelligence (AI) applications such as computer vision, robotics, and Natural Language Processing (NLP)
The proposed method provides a local solution that enables DNN implementation on edge devices
We propose an efficient hardware decoder implementation that can be integrated into a DNN hardware accelerator
Summary
Deep Neural Networks (DNNs) have become a powerful tool for many Artificial Intelligence (AI) applications such as computer vision, robotics, and Natural Language Processing (NLP). Beyond storing the network weights, additional memory is needed for the network's internal temporary arithmetic computations. Because of these extensive memory requirements, deploying DNNs on mobile devices presents a unique challenge for low-power, efficient implementation in embedded systems with limited memory space. A common workaround is to offload computation to the cloud: in a translation application, for example, the task is carried out by sending the text from the local device to the cloud for processing. This approach increases execution time and requires an expensive, continuously available communication link, which is not always feasible. This paper proposes efficient entropy-based lossless compression and decompression algorithms and a real-time hardware decoder implementation. The proposed method provides a local solution that enables DNN implementation on edge devices.
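The paper's exact entropy coder is not reproduced here, but the core idea it relies on can be illustrated with a minimal sketch: quantized DNN weights are sharply peaked around zero, so a variable-length entropy code (classic Huffman coding is used below as a stand-in, with an assumed toy weight distribution and an assumed 8-bit fixed-point baseline) stores frequent weight values in fewer bits than a fixed-width encoding.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Compute Huffman code lengths (bits per symbol) from symbol frequencies."""
    # Heap entries: (total frequency, tie-breaker, {symbol: code_length}).
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        tie += 1
        heapq.heappush(heap, (f1 + f2, tie, merged))
    return heap[0][2]

# Hypothetical quantized weight values: heavily concentrated near zero,
# as is typical for trained DNN layers after quantization.
weights = [0] * 600 + [1] * 150 + [-1] * 150 + [2] * 50 + [-2] * 50

freqs = Counter(weights)
lengths = huffman_code_lengths(freqs)
total_bits = sum(lengths[w] for w in weights)
fixed_bits = 8 * len(weights)  # assumed 8-bit fixed-point storage baseline
print(f"entropy-coded: {total_bits} bits, fixed-width: {fixed_bits} bits, "
      f"saving: {1 - total_bits / fixed_bits:.0%}")
```

With this skewed toy distribution the zero weight receives a one-bit code, and the encoded stream is a fraction of the fixed-width size, which is the effect the paper's reported ~70%/63% memory reductions rest on. The hardware decoder side of the paper (one weight per clock cycle) additionally requires a decode structure suited to pipelined lookup, which this software sketch does not model.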