The Compression Techniques Applied on Deep Learning Model

Haoyuan He, Zisen Huang, Lingxuan Huang, Tiantian Yang

doi:10.54097/hset.v4i.920

Abstract

In recent years, smartphone penetration has approached saturation, and artificial intelligence has emerged as a cutting-edge technology capable of triggering disruptive change. Deep neural networks are beginning to appear on mobile devices. Achieving better performance requires more complex networks, so model size, computation, and storage requirements keep growing, while mobile devices remain constrained in resource allocation and energy consumption. Techniques for compressing deep learning models are therefore important, and this paper surveys the related literature. It reviews compression techniques for deep neural networks and introduces the key points of knowledge distillation, including how the network model affects the learning performance of Resolution-Aware Knowledge Distillation. It evaluates a low-rank decomposition algorithm based on sparse parameters and rank, using the extended Bayesian information criterion (BIC) for tuning-parameter selection. It discusses pruning strategies that reduce redundancy in the fully connected and convolutional layers of the trained network model. Finally, it presents quantization techniques, including a neural network that quantizes weights and activations by applying differentiable nonlinear functions.

Full Text