Abstract

Recent work on convolutional neural networks (CNNs) has attempted to find better local optima with ensemble-based approaches. Fast Geometric Ensembling (FGE) showed that weight points captured near the end of training circulate around local optima. This led to the Stochastic Weight Averaging (SWA) approach, which averages multiple weight points to reach a better local optimum. However, both approaches output fully parameterized models that retain needless parameters after training. To solve this problem, we propose a novel training procedure: Stochastic Weight Averaging by One-way Variational Pruning (SWA-OVP). SWA-OVP reduces the number of model parameters by variationally updating a pruning mask over the weights. Whereas recent pruning approaches produce the mask only at the end of training, SWA-OVP variationally generates the mask for pruned weights at every iteration. In addition, SWA-OVP prunes the model in a single one-way training pass, whereas other recent approaches prune weights through iterative retraining or require additional computation. Our experiments show that SWA-OVP, using only 0.5x $\sim$ 0.7x of the parameters, achieves even higher accuracy than SWA and FGE on several networks, such as Pre-ResNet110, Pre-ResNet164, and WideResNet28x10, on the CIFAR10 and CIFAR100 datasets. SWA-OVP also outperforms state-of-the-art pruning approaches.
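The two ingredients the abstract describes can be summarized in a short sketch: keep a running SWA average of the weights while a per-weight keep-probability (the pruning mask) is updated at every iteration and only ever decreases ("one-way"). The PyTorch-style sketch below is a minimal illustration under strong assumptions; the toy model, the magnitude-based update rule, the 0.99 decay factor, and the final 0.5 threshold are hypothetical choices for illustration, not the authors' variational formulation.

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy model and random data stand in for Pre-ResNet / WideResNet on CIFAR.
model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.05, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

# Running SWA average of the weights (one entry per parameter tensor).
swa_state = {k: v.detach().clone() for k, v in model.state_dict().items()}
n_avg = 1

# Keep-probabilities acting as the per-weight mask parameters; they are
# updated every iteration and only ever decrease ("one-way").
keep_prob = {k: torch.ones_like(v) for k, v in model.state_dict().items() if v.dim() > 1}

for step in range(200):
    x, y = torch.randn(32, 20), torch.randint(0, 10, (32,))

    # Sample a binary mask from the current keep-probabilities and apply it
    # to the weights before the forward pass.
    with torch.no_grad():
        for k, p in keep_prob.items():
            model.state_dict()[k].mul_(torch.bernoulli(p))

    opt.zero_grad()
    loss_fn(model(x), y).backward()
    opt.step()

    # Assumed mask update: lower the keep-probability of the smallest-magnitude
    # weights a little each iteration (a monotone move toward pruning).
    with torch.no_grad():
        for k, p in keep_prob.items():
            w = model.state_dict()[k].abs()
            thresh = w.flatten().kthvalue(max(1, int(0.3 * w.numel()))).values
            p.mul_(torch.where(w < thresh, torch.full_like(w, 0.99), torch.ones_like(w)))

    # SWA: fold the current weights into the running average every few steps.
    if step % 10 == 0:
        with torch.no_grad():
            for k, v in model.state_dict().items():
                swa_state[k].mul_(n_avg / (n_avg + 1)).add_(v / (n_avg + 1))
        n_avg += 1

# Final model: load the averaged weights and hard-prune low-probability entries.
model.load_state_dict(swa_state)
with torch.no_grad():
    for k, p in keep_prob.items():
        model.state_dict()[k].mul_((p > 0.5).float())

Because the mask is refreshed at every iteration and its keep-probabilities never increase, pruning happens inside the single training pass rather than as a separate retraining stage, which is the property the abstract contrasts against iterative pruning schemes.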
