Abstract
With the rapid development of technologies and the growing number of neural network applications, the problem of optimization arises. Among the methods for reducing training and inference time, neural network pruning has attracted considerable attention in recent years. The main goal of pruning is to reduce the computational complexity of neural network models while keeping performance metrics at a desired level. Among the various approaches to pruning, Single-shot Network Pruning (SNIP) was designed as a straightforward and effective way to reduce the number of parameters before training. However, as neural network architectures have evolved, particularly with the growing popularity of transformers, a need to reevaluate traditional pruning methods has arisen. This paper revisits the SNIP pruning method, evaluates its performance on a transformer model, and introduces an enhanced version of SNIP designed specifically for transformer architectures. The paper outlines the mathematical framework of the SNIP algorithm and proposes a modification based on the specifics of transformer models. Transformer models have achieved impressive results, owing to their attention mechanisms, on a multitude of tasks such as language modeling, translation, and computer vision. The proposed modification takes this unique feature into account and combines it with traditional loss gradients. The original method computes an importance score for the network's weights using only gradients of the loss function. In the enhanced version, the importance score is a composite metric that incorporates not only the gradient of the loss function but also gradients of the attention activations. To evaluate the efficiency of the proposed modifications, a series of experiments was conducted on an image classification task using the Linformer variant of the transformer architecture.
The experimental results demonstrate the efficiency of incorporating attention scores into pruning. The model pruned with the modified algorithm outperforms the model pruned with the original SNIP by 34% in validation accuracy, confirming the validity of the introduced improvements.
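As a rough sketch of the idea described above (not the paper's implementation), the following NumPy code computes the standard SNIP connection sensitivity |g · w| and a hypothetical composite score that blends it with an attention-gradient term via an assumed coefficient `alpha`; the function names and the top-k masking scheme are illustrative assumptions.

```python
import numpy as np

def snip_saliency(weights, loss_grads):
    # SNIP connection sensitivity: |dL/dc_j| = |g_j * w_j|, normalized to sum to 1
    s = np.abs(weights * loss_grads)
    return s / s.sum()

def combined_saliency(weights, loss_grads, attn_grads, alpha=0.5):
    # Hypothetical composite score: blend loss-gradient saliency with
    # a saliency derived from attention-activation gradients (alpha is assumed)
    s_loss = snip_saliency(weights, loss_grads)
    s_attn = np.abs(weights * attn_grads)
    s_attn = s_attn / s_attn.sum()
    return alpha * s_loss + (1.0 - alpha) * s_attn

def prune_mask(saliency, keep_frac):
    # Keep the top keep_frac fraction of connections by saliency
    k = max(1, int(round(keep_frac * saliency.size)))
    thresh = np.sort(saliency.ravel())[-k]
    return saliency >= thresh
```

Applying `prune_mask(snip_saliency(w, g), 0.1)` before training would retain only the 10% of connections with the highest sensitivity, which is the single-shot, pre-training character of SNIP that the paper builds on.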