Abstract

Deep learning (DL) models have recently excelled across a wide range of fields. These successes rest on intricate models with hundreds of millions or even billions of parameters, trained and served on high-performance graphics processing units or tensor processing units. The need to deploy DL models on real-time devices with tight latency, memory, and power constraints is the key driving force behind research into DL model compression techniques. At the same time, growing data availability encourages multimodal fusion in DL models to boost predictive accuracy. To obtain compact DL models whose deployment is memory- and computation-efficient, the information carried in the network parameters is compressed as far as possible, retaining only the bits necessary to carry out the task. Model acceleration and compression therefore require a careful trade-off between compression rate and accuracy loss, so that the model's performance is not severely degraded. In this paper, we examine DL model compression techniques used for both single-modality and multimodal deep learning tasks. We survey numerous compression methods that have advanced across a range of applications, discuss the benefits and drawbacks of various compression and acceleration approaches, such as their ineffectiveness at compressing more complicated networks with dimensionality-dependent structures, and conclude with the field's future prospects.
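As a minimal illustration of the parameter-compression idea summarized above (retaining only the bits needed for the task), the sketch below shows symmetric post-training quantization of a weight tensor from float32 to int8. It is a simplified NumPy example under assumed settings (per-tensor scaling, a randomly generated weight matrix), not any specific method surveyed in the paper.

```python
import numpy as np

def quantize_int8(weights: np.ndarray):
    """Symmetric per-tensor quantization of float32 weights to int8."""
    scale = np.abs(weights).max() / 127.0              # map the largest magnitude to 127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover approximate float32 weights for inference-time use."""
    return q.astype(np.float32) * scale

# Illustrative usage on a random dense-layer weight matrix (hypothetical shape).
rng = np.random.default_rng(0)
w = rng.standard_normal((256, 128)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize(q, scale)
print("max abs error:", np.abs(w - w_hat).max())       # bounded by roughly scale / 2
print("storage reduction: 4x (float32 -> int8)")
```

The 4x storage reduction comes purely from the narrower datatype; the accuracy impact of the rounding error is exactly the compression-rate versus accuracy-loss trade-off discussed in the abstract.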
