Abstract

In recent years, there has been rapid development in the field of neural networks. They have evolved from simple feed-forward networks to more complex architectures such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are used for tasks such as image recognition, where sequence order is not essential, while RNNs are useful when order matters, such as in machine translation. Increasing the number of layers in a network can improve its performance (Alford et al. in Pruned and structurally sparse neural networks, 2018 [1]). However, this also increases the complexity of the network, and training requires more compute and time. Introducing sparsity into the architecture of the neural network can address this problem. Pruning is one of the processes through which a neural network can be made sparse (Zhu and Gupta in To prune, or not to prune: exploring the efficacy of pruning for model compression, 2017 [2]). Sparse RNNs can be easily deployed on mobile devices and resource-constrained servers (Wen et al. in Learning intrinsic sparse structures within long short-term memory, 2017 [3]). We investigate the following methods to induce sparsity in RNNs: RNN pruning and automated gradual pruning. We also investigate how the pruning techniques impact the model's performance and provide a detailed comparison between the two techniques. In addition, we experiment with pruning input-to-hidden and hidden-to-hidden weights. Based on the results of the pruning experiments, we conclude that it is possible to reduce the complexity of RNNs by more than 80%.
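For reference, the automated gradual pruning method of Zhu and Gupta [2] raises the target sparsity from an initial value s_i to a final value s_f over n pruning steps following a cubic schedule, and prunes the smallest-magnitude weights to meet that target. Below is a minimal, framework-agnostic sketch of this idea applied to a hypothetical hidden-to-hidden LSTM weight matrix; the function names (`gradual_sparsity`, `magnitude_prune`) and the matrix shape are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def gradual_sparsity(step, s_i=0.0, s_f=0.9, t0=0, n=100, dt=1):
    """Cubic sparsity schedule from Zhu & Gupta (2017):
    s_t = s_f + (s_i - s_f) * (1 - (step - t0) / (n * dt))**3."""
    frac = min(max((step - t0) / (n * dt), 0.0), 1.0)
    return s_f + (s_i - s_f) * (1.0 - frac) ** 3

def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude entries so that roughly `sparsity`
    fraction of the matrix is zero; returns pruned weights and the mask."""
    k = int(round(sparsity * weights.size))
    if k == 0:
        return weights, np.ones_like(weights, dtype=bool)
    threshold = np.sort(np.abs(weights).ravel())[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask, mask

# Example: gradually prune a hypothetical hidden-to-hidden LSTM weight
# matrix of shape (4 * hidden, hidden), i.e. four gates of 256 units each.
rng = np.random.default_rng(0)
W_hh = rng.standard_normal((1024, 256))
for step in range(0, 101, 25):
    s_t = gradual_sparsity(step)
    W_pruned, mask = magnitude_prune(W_hh, s_t)
    print(f"step {step:3d}: target sparsity {s_t:.2f}, actual {(~mask).mean():.2f}")
```

In practice this schedule would be interleaved with training, so that the remaining weights can recover from each pruning step before sparsity is increased further; the same mask-based procedure applies to input-to-hidden weights.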
