A Survey on Deep Neural Network Pruning: Taxonomy, Comparison, Analysis, and Recommendations.

Hongrong Cheng, Miao Zhang, Javen Qinfeng Shi

doi:10.1109/tpami.2024.3447085

Abstract

Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to accelerate inference, researchers have increasingly explored pruning techniques as a popular research direction in neural network compression. More than three thousand pruning papers were published from 2020 to 2024. However, there is a dearth of up-to-date comprehensive review papers on pruning. To address this issue, in this survey, we provide a comprehensive review of existing research on deep neural network pruning in a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning and other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights, etc.) and explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models, post-training pruning, and different levels of supervision for pruning, to shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we provide recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding, etc.) and build a curated collection of datasets, networks, and evaluations on different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code. We will keep updating this repository to include the latest advancements in the field.
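
The unstructured/structured contrast named in the abstract can be illustrated with a minimal sketch. It is not taken from the survey: the function names, the magnitude and L1-norm criteria, the choice of rows as the structural unit, and the toy matrix W are all assumptions made for illustration, and both functions prune in a single pass (one-shot).

```python
# Illustrative sketch only. The names, the pruning criteria, and the toy
# matrix are assumptions for this example, not the survey's method.

def unstructured_prune(weights, sparsity):
    """Zero the individual weights with the smallest absolute values.

    The matrix keeps its shape; only individual entries become zero.
    """
    # Rank every entry by magnitude, remembering its position.
    ranked = sorted(
        (abs(w), i, j)
        for i, row in enumerate(weights)
        for j, w in enumerate(row)
    )
    k = int(len(ranked) * sparsity)  # number of weights to zero out
    pruned = [list(row) for row in weights]  # copy, leave input untouched
    for _, i, j in ranked[:k]:
        pruned[i][j] = 0.0
    return pruned


def structured_prune(weights, sparsity):
    """Remove whole rows (treated here as output neurons) with the smallest L1 norm.

    The result is a smaller dense matrix.
    """
    norms = sorted((sum(abs(w) for w in row), i) for i, row in enumerate(weights))
    k = int(len(weights) * sparsity)  # number of rows to drop
    dropped = {i for _, i in norms[:k]}
    return [list(row) for i, row in enumerate(weights) if i not in dropped]


# Toy 4x3 weight matrix (hypothetical values).
W = [
    [0.9, -0.1, 0.4],
    [0.05, 0.02, -0.03],
    [-0.7, 0.6, 0.2],
    [0.3, -0.8, 0.01],
]

sparse_W = unstructured_prune(W, 0.5)  # same shape, half the entries zeroed
small_W = structured_prune(W, 0.5)     # two of the four rows removed
```

In this sketch the unstructured result keeps the original 4x3 shape, so any speedup would generally depend on sparse-aware software or hardware, whereas the structured result is an ordinary 2x3 dense matrix that standard dense kernels can use directly.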

Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Publication Date: Jan 1, 2024
Citations: 4

Similar Papers

Pruning in the Face of Adversaries
Florian Merkle ... Pascal Schöttle
01 Jan 2021

Large Language Models are Good Translators
Zhaohan Zeng ... Zhibin Liang
Journal of Emerging Investigators
01 Jan 2024

Multiobjective evolutionary pruning of Deep Neural Networks with Transfer Learning for improving their performance and robustness
Javier Poyatos ... Francisco Herrera
Applied Soft Computing | VOL. 147
15 Aug 2023

Hybrid tensor decomposition in neural network compression
Bijiao Wu ... Guoqi Li
Neural Networks | VOL. 132
19 Sep 2020
