Abstract

Deep Neural Networks (DNNs) are widely applied in mobile applications that demand real-time operation and large memory space. This presents a new challenge for low-power, efficient implementation of a diversity of applications, such as speech recognition and image classification, on embedded edge devices. This work presents a hardware-oriented DNN compression approach to address the limited memory resources of edge devices. We propose a new entropy-based compression algorithm for encoding DNN weights, together with a real-time decoding method and an efficient dedicated hardware implementation. The proposed approach significantly reduces the required DNN weight memory (by approximately 70% for AlexNet and 63% for VGG19) while decoding one weight per clock cycle. Results show a high compression ratio compared to well-known lossless compression algorithms. The proposed hardware decoder enables efficient implementation of large DNNs on low-power edge devices with limited memory resources.

Highlights

  • Deep Neural Networks (DNNs) have become a powerful tool for many Artificial Intelligence (AI) applications such as computer vision, robotics, and NLP

  • This method provides a local, on-device solution that enables DNN implementation on edge devices

  • We propose an efficient hardware decoder implementation that can be integrated into a DNN hardware accelerator


Summary

INTRODUCTION

Deep Neural Networks (DNNs) have become a powerful tool for many Artificial Intelligence (AI) applications such as computer vision, robotics, and NLP. Beyond storing the weights themselves, additional memory is needed for the network's internal temporal arithmetic computations. Because of these extensive memory requirements, deploying DNNs on mobile devices presents a unique challenge for low-power, efficient implementation in embedded systems with limited memory space. A common workaround is cloud offloading; for example, a translation task is carried out by sending the text from the local device to the cloud for processing. This approach increases execution time and requires an expensive, continuous communication link, which is not always feasible. This paper proposes efficient entropy-based lossless compression and decompression algorithms and a real-time hardware decoder implementation method. This method provides a local solution that enables DNN implementation on edge devices.
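To illustrate why entropy coding suits DNN weights, the sketch below builds a Huffman code over a hypothetical, heavily skewed distribution of 4-bit quantized weights and compares the encoded size against the fixed-width representation. This is a minimal software illustration of the general idea, not the paper's specific algorithm or hardware decoder; the weight distribution and 4-bit width are assumptions chosen for the example.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a Huffman code table {symbol: bit-string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap items: (frequency, tie-breaker id, partial code table).
    heap = [(f, i, {s: ""}) for i, (s, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)
        f2, _, c2 = heapq.heappop(heap)
        # Merge the two least-frequent subtrees, prefixing their codes.
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (f1 + f2, next_id, merged))
        next_id += 1
    return heap[0][2]

# Hypothetical 4-bit quantized weights, skewed toward zero (typical after pruning).
weights = [0] * 70 + [1] * 15 + [-1] * 10 + [7] * 5
table = huffman_code(weights)
encoded_bits = sum(len(table[w]) for w in weights)
raw_bits = 4 * len(weights)  # fixed 4-bit storage baseline
print(f"raw: {raw_bits} bits, encoded: {encoded_bits} bits, "
      f"saving: {1 - encoded_bits / raw_bits:.0%}")
```

Because frequent weight values receive short codewords, the skewed distribution compresses well; on this toy example the saving is in the same rough range as the memory reductions the paper reports, though the actual figures depend entirely on the real weight statistics.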

AND RELATED WORK
A DETAILED EXAMPLE OF THE COMPRESSION ALGORITHM
EXPERIMENTAL RESULTS
CONCLUSION
Full Text

