Abstract

Convolutional Neural Networks (CNNs) play a major role in image classification, object detection, and segmentation tasks, and integrating several CNN models at appropriate places can address the majority of computer vision problems. A CNN model consists of millions of parameters that must be trained on high-performance computational devices, so model compression is essential: the large number of parameters must be reduced without losing model performance. This study employs the Discrete Wavelet Transform (DWT) to decompose input images before passing them through a set of fused convolutional layers in an existing model. It also attempts to minimize the accuracy drop caused by downsampling by replacing conventional pooling operations with DWT, which reduces the information loss that downsampling introduces. In addition, the models are compressed using quantization techniques to produce lightweight models that can be easily deployed on mobile and edge devices: Quantization Aware Training (QAT) is applied after the wavelet modifications. The resulting models are referred to as 'WaveQuant' models. Commonly used CNN models such as AlexNet, GoogLeNet, and ResNets are tested on the CIFAR-10 dataset, and their accuracy is comparable to that of the vanilla models. The modified LeNet model (WaveQuant LeNet) has only 13k parameters and also achieves comparable test accuracy. All the modified 'WaveQuant' models have significantly fewer parameters than their original counterparts, making them more lightweight.
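The abstract does not give implementation details, but the core idea of replacing a pooling layer with a DWT can be sketched with a one-level 2-D Haar transform (a minimal illustration, not the paper's actual code; the function name `haar_dwt2` and the NumPy implementation are assumptions for this sketch). The LL subband halves the spatial resolution, like a 2x2 pooling step, while the LH, HL, and HH detail subbands retain information that ordinary pooling would discard:

```python
import numpy as np

def haar_dwt2(x):
    """One level of the 2-D Haar DWT on a 2-D array with even dimensions.

    Returns (LL, LH, HL, HH); each subband is half the spatial size of x.
    """
    # The four samples of each non-overlapping 2x2 block.
    a = x[0::2, 0::2]  # top-left
    b = x[0::2, 1::2]  # top-right
    c = x[1::2, 0::2]  # bottom-left
    d = x[1::2, 1::2]  # bottom-right
    ll = (a + b + c + d) / 2.0  # approximation (low-low)
    lh = (a + b - c - d) / 2.0  # vertical detail
    hl = (a - b + c - d) / 2.0  # horizontal detail
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

img = np.arange(16, dtype=float).reshape(4, 4)
ll, lh, hl, hh = haar_dwt2(img)
# LL is a scaled version of 2x2 average pooling; the detail subbands
# carry the high-frequency content that pooling alone would lose.
```

In a practical DWT-based layer, the LL subband (or a stack of all four subbands) would be fed to the subsequent convolutional layers in place of the pooled feature map.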
