Abstract

Deep convolutional neural networks (CNNs) have been successful in many machine vision tasks; however, their millions of weights, organized into thousands of convolutional filters, make them difficult for humans to interpret or to use for scientific understanding. In this article, we introduce a greedy structural compression scheme that yields smaller and more interpretable CNNs while achieving accuracy close to that of the original network. The compression prunes the filters that contribute least to classification accuracy, that is, those with the lowest Classification Accuracy Reduction (CAR) importance index. We demonstrate the interpretability of CAR-compressed CNNs by showing that our algorithm prunes filters with visually redundant functionalities, such as color filters. These compressed networks are easier to interpret because they retain the filter diversity of uncompressed networks with an order of magnitude fewer filters. Finally, we introduce a variant of CAR that quantifies the importance of each image category to each CNN filter; the most and least important class labels are shown to provide meaningful interpretations of each filter.
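The article itself does not include an implementation; the snippet below is a minimal PyTorch-style sketch of how a CAR-like importance score could be estimated. The names `car_indices`, `classification_accuracy`, and the `val_loader` argument are illustrative assumptions, not the authors' code: each filter in a convolutional layer is zeroed in turn, the drop in held-out classification accuracy is recorded, and a greedy compression step would then remove the filter with the smallest drop.

```python
import torch

def classification_accuracy(model, loader, device="cpu"):
    """Fraction of correctly classified images in a held-out set."""
    model.eval()
    correct = total = 0
    with torch.no_grad():
        for images, labels in loader:
            preds = model(images.to(device)).argmax(dim=1)
            correct += (preds == labels.to(device)).sum().item()
            total += labels.size(0)
    return correct / total

def car_indices(model, conv_layer, val_loader, device="cpu"):
    """Sketch of a CAR-style importance score for each filter in `conv_layer`:
    the reduction in classification accuracy when that filter is zeroed out."""
    base_acc = classification_accuracy(model, val_loader, device)
    scores = []
    for i in range(conv_layer.out_channels):
        saved_w = conv_layer.weight.data[i].clone()
        saved_b = conv_layer.bias.data[i].clone() if conv_layer.bias is not None else None
        conv_layer.weight.data[i].zero_()          # prune filter i temporarily
        if saved_b is not None:
            conv_layer.bias.data[i] = 0.0
        scores.append(base_acc - classification_accuracy(model, val_loader, device))
        conv_layer.weight.data[i] = saved_w        # restore the filter
        if saved_b is not None:
            conv_layer.bias.data[i] = saved_b
    return scores  # a greedy pruning step removes the filter with the smallest score
```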

Highlights

  • Deep convolutional neural networks (CNNs) achieve state-of-the-art performance for a wide variety of tasks in computer vision, such as image classification and segmentation (Krizhevsky et al., 2012; Long et al., 2015).

  • Although achieving a state-of-the-art compression ratio is not the main goal of this paper, we show that our classification accuracy reduction (CAR) structural compression scheme achieves higher classification accuracy on a hold-out test set than baseline structural compression methods.

  • When pruning half of the filters in any individual convolutional layer of AlexNet, our CAR algorithm achieves 16% to 25% higher classification accuracy than the best benchmark filter-pruning scheme.


Summary

Introduction

Deep convolutional neural networks (CNNs) achieve state-of-the-art performance for a wide variety of tasks in computer vision, such as image classification and segmentation (Krizhevsky et al., 2012; Long et al., 2015). CNNs are widely employed on many data-driven platforms such as cellphones, smartwatches, and robots. While the huge number of weights and convolutional filters in deep CNNs is a key factor in their success, it makes them hard or impossible to interpret, both in general and especially in scientific and medical applications (Montavon et al., 2017; Abbasi-Asl et al., 2018). Compressing CNNs, that is, reducing the number of weights while maintaining prediction performance, facilitates interpretation and understanding in science and medicine. Compression also benefits the use of CNNs on platforms with limited memory and computational power.
