Abstract

In the past decade, Convolutional Neural Networks (CNNs) have achieved tremendous success in solving complex classification problems. However, CNN architectures require an excessive number of computations to achieve high accuracy, and their heavy storage and energy costs prohibit deployment on resource-constrained edge devices. Developing aggressive optimization schemes for the efficient deployment of CNNs on edge devices has therefore become a pressing requirement. To this end, we present a memory-efficient network compression model for image-level object classification in resource-limited environments. The main aim is to compress the CNN architecture, achieving low computational cost and memory requirements without degrading the model's accuracy. To achieve this goal, we propose a network compression strategy whose stages work collaboratively: Soft Filter Pruning is first applied to reduce the computational cost of the model. The model is then divided into No-Pruning Layers (NP-Layers) and Pruning Layers (P-Layers). Incremental Quantization is applied to the P-Layers because of their irregular weight distribution, while for the NP-Layers we propose a novel Optimized Quantization algorithm that quantizes weights to the optimal levels obtained from the Optimizer. This scheme is designed to achieve the best trade-off between the compression ratio and the accuracy of the model. Our proposed system is validated for image-level object classification on the LeNet-5, CIFAR-quick, and VGG-16 networks using the MNIST, CIFAR-10, and ImageNet ILSVRC2012 datasets, respectively. We achieve a high compression ratio with negligible accuracy drop, outperforming state-of-the-art methods.
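
To make the proposed pipeline concrete, below is a minimal PyTorch sketch of its two stages, assuming a filter-level L2-norm criterion for Soft Filter Pruning and a signed power-of-two codebook for the incremental quantization step; the function names, prune ratio, and exponent range are illustrative assumptions rather than the paper's released code.

    import torch
    import torch.nn as nn

    def soft_filter_prune(conv: nn.Conv2d, prune_ratio: float = 0.3) -> None:
        # Soft Filter Pruning: zero the lowest-L2-norm filters after an epoch,
        # but leave them trainable so pruned filters can recover later.
        with torch.no_grad():
            norms = conv.weight.flatten(1).norm(p=2, dim=1)  # one norm per output filter
            n_prune = int(prune_ratio * conv.out_channels)
            if n_prune > 0:
                idx = torch.argsort(norms)[:n_prune]         # weakest filters
                conv.weight[idx] = 0.0                       # zeroed, not removed

    def quantize_pow2(w: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
        # Incremental-quantization-style projection of weights onto signed
        # powers of two or zero; the exponent range below is an assumption.
        n1 = float(torch.log2(4.0 * w.abs().max() / 3.0).floor())  # largest exponent
        n2 = n1 + 1 - 2 ** (n_bits - 1)                            # smallest exponent
        exp = torch.log2(w.abs().clamp(min=1e-12)).round().clamp(n2, n1)
        q = torch.sign(w) * 2.0 ** exp
        q[w.abs() < 2.0 ** (n2 - 1)] = 0.0                         # tiny weights snap to zero
        return q

    # Toy usage: prune a layer, then quantize the surviving weights in place.
    conv = nn.Conv2d(3, 16, kernel_size=3)
    soft_filter_prune(conv, prune_ratio=0.3)
    with torch.no_grad():
        conv.weight.copy_(quantize_pow2(conv.weight))

In a full training run these two steps would be interleaved with retraining epochs, so that the remaining full-precision weights can compensate for the pruned and quantized ones.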
