Abstract
Modern deep neural networks, particularly recent large language models, come with massive model sizes that require significant computational and storage resources. To enable the deployment of modern models in resource-constrained environments and to reduce inference time, researchers have increasingly explored pruning as a direction in neural network compression. More than three thousand pruning papers were published between 2020 and 2024. However, there is a dearth of up-to-date, comprehensive review papers on pruning. To address this issue, this survey provides a comprehensive review of existing work on deep neural network pruning, organized by a taxonomy of 1) universal/specific speedup, 2) when to prune, 3) how to prune, and 4) fusion of pruning and other compression techniques. We then provide a thorough comparative analysis of eight pairs of contrast settings for pruning (e.g., unstructured/structured, one-shot/iterative, data-free/data-driven, initialized/pre-trained weights, etc.). We also explore several emerging topics, including pruning for large language models, vision transformers, diffusion models, and large multimodal models; post-training pruning; and different levels of supervision for pruning. Together, these shed light on the commonalities and differences of existing methods and lay the foundation for further method development. Finally, we provide recommendations on selecting pruning methods and outline several promising research directions for neural network pruning. To facilitate future research on deep neural network pruning, we summarize broad pruning applications (e.g., adversarial robustness, natural language understanding, etc.) and build a curated collection of datasets, networks, and evaluations for different applications. We maintain a repository at https://github.com/hrcheng1066/awesome-pruning that serves as a comprehensive resource for neural network pruning papers and corresponding open-source code.
We will keep updating this repository to include the latest advancements in the field.
More From: IEEE Transactions on Pattern Analysis and Machine Intelligence