Abstract
Deep Neural Networks (DNNs) are widely applied in mobile applications that demand real-time operation yet require large memory space. This presents a new challenge for the low-power, efficient implementation of diverse applications, such as speech recognition and image classification, on embedded edge devices. This work presents a hardware-based DNN compression approach that addresses the limited memory resources of edge devices. We propose a new entropy-based compression algorithm for encoding DNN weights, together with a real-time decoding method and an efficient dedicated hardware implementation. The proposed approach significantly reduces the required DNN weight memory (by approximately 70% for AlexNet and 63% for VGG19) while decoding one weight per clock cycle. Results show a high compression ratio compared with well-known lossless compression algorithms. The proposed hardware decoder enables efficient implementation of large DNNs on low-power edge devices with limited memory resources.
Highlights
Deep Neural Networks (DNNs) have become a powerful tool for many Artificial Intelligence (AI) applications such as computer vision, robotics, and Natural Language Processing (NLP)
The proposed method provides a local solution that enables DNN implementation on edge devices
We propose an efficient hardware decoder implementation that can be integrated into a DNN hardware accelerator
Summary
Deep Neural Networks (DNNs) have become a powerful tool for many Artificial Intelligence (AI) applications such as computer vision, robotics, and Natural Language Processing (NLP). Beyond storing the network weights, additional memory is needed for the network's internal temporary arithmetic computations. Because of these extensive memory requirements, deploying DNNs on mobile devices presents a unique challenge for low-power, efficient implementation in embedded systems with limited memory space. A common workaround is to offload computation to the cloud: in a translation application, for example, the task is carried out by sending the text from the local device to the cloud for processing. This approach increases execution time and requires an expensive, continuously available communication link, which is not always feasible. This paper proposes efficient entropy-based lossless compression and decompression algorithms and a real-time hardware decoder implementation. The proposed method provides a local solution that enables DNN implementation on edge devices.
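The paper's exact entropy coder is not reproduced here, but the core idea it relies on can be illustrated with a minimal sketch: quantized DNN weights are sharply peaked around zero, so a variable-length entropy code (classic Huffman coding is used below as a stand-in, with an assumed toy weight distribution and an assumed 8-bit fixed-point baseline) stores frequent weight values in fewer bits than a fixed-width encoding.

```python
import heapq
from collections import Counter

def huffman_code_lengths(freqs):
    """Compute Huffman code lengths (bits per symbol) from symbol frequencies."""
    # Heap entries: (total frequency, tie-breaker, {symbol: code_length}).
    heap = [(f, i, {s: 0}) for i, (s, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        # Merging two subtrees pushes every contained symbol one level deeper.
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        tie += 1
        heapq.heappush(heap, (f1 + f2, tie, merged))
    return heap[0][2]

# Hypothetical quantized weight values: heavily concentrated near zero,
# as is typical for trained DNN layers after quantization.
weights = [0] * 600 + [1] * 150 + [-1] * 150 + [2] * 50 + [-2] * 50

freqs = Counter(weights)
lengths = huffman_code_lengths(freqs)
total_bits = sum(lengths[w] for w in weights)
fixed_bits = 8 * len(weights)  # assumed 8-bit fixed-point storage baseline
print(f"entropy-coded: {total_bits} bits, fixed-width: {fixed_bits} bits, "
      f"saving: {1 - total_bits / fixed_bits:.0%}")
```

With this skewed toy distribution the zero weight receives a one-bit code, and the encoded stream is a fraction of the fixed-width size, which is the effect the paper's reported ~70%/63% memory reductions rest on. The hardware decoder side of the paper (one weight per clock cycle) additionally requires a decode structure suited to pipelined lookup, which this software sketch does not model.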