Abstract
This study explores the combined use of fixed-point quantization and structured pruning to optimize the performance and efficiency of convolutional neural networks (CNNs) in image classification tasks. These techniques reduce model size and computational complexity, making CNNs more suitable for deployment in resource-constrained environments such as mobile devices and embedded systems. Fixed-point quantization lowers the bit-width of weights and activations, thereby reducing the computational load and memory footprint. Structured pruning, in turn, systematically removes unimportant convolutional filters or channels, which further shrinks the model and increases inference speed. An experimental evaluation was performed on the ImageNet dataset using the ResNet-50 architecture. The results show that the combined quantization-and-pruning strategy reduces the model size by up to 75% and increases inference speed by 50%, while maintaining a classification accuracy of 74.5%, compared to 76.4% for the baseline model. Given the substantial efficiency gains, this 1.9-percentage-point accuracy drop is an acceptable trade-off. The results demonstrate that the integrated approach effectively compresses and accelerates the CNN model without a significant loss of accuracy, making it well suited for real-time applications.
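The two techniques the abstract combines can be sketched in a few lines. The NumPy sketch below is illustrative only: the function names, the per-tensor symmetric quantization scheme, and the L1-norm filter-importance criterion are assumptions for the example, not the paper's exact method.

```python
import numpy as np

def quantize_fixed_point(w, bits=8):
    """Symmetric per-tensor fixed-point quantization to `bits` bits
    (an assumed scheme for illustration)."""
    qmax = 2 ** (bits - 1) - 1            # e.g. 127 for 8-bit
    scale = np.max(np.abs(w)) / qmax      # one scale for the whole tensor
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale                       # dequantize as q * scale

def prune_filters_l1(conv_w, keep_ratio=0.5):
    """Structured pruning: keep the output filters with the largest
    L1 norms. conv_w has shape (out_channels, in_channels, kH, kW);
    whole filters are removed, so the remaining layer stays dense."""
    norms = np.abs(conv_w).reshape(conv_w.shape[0], -1).sum(axis=1)
    n_keep = max(1, int(conv_w.shape[0] * keep_ratio))
    keep = np.sort(np.argsort(norms)[::-1][:n_keep])  # surviving filter indices
    return conv_w[keep], keep

# Example on a randomly initialized 64-filter 3x3 conv layer
w = np.random.randn(64, 3, 3, 3).astype(np.float32)
q, s = quantize_fixed_point(w, bits=8)          # int8 weights + float scale
pruned, kept = prune_filters_l1(w, keep_ratio=0.25)  # 16 filters remain
```

Because pruning removes entire filters rather than individual weights, the resulting layer maps directly onto standard dense convolution kernels, which is why structured pruning translates into real inference speedups on commodity hardware.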