Neural network pruning is a popular approach to reducing model storage size and inference time by removing redundant parameters from a neural network. However, the uncertainty of predictions from pruned models remains unexplored. In this paper, we study neural network pruning in the context of conformal predictors (CP). The conformal prediction framework, built on top of machine learning algorithms, supplements their predictions with a reliable uncertainty measure in the form of prediction sets, under the assumption that the data are independent and identically distributed. Convolutional neural networks (CNNs) have complex architectures and are widely used across many applications, so we focus on pruning CNNs and, in particular, filter-level pruning. We first propose a brute-force method that estimates each filter's contribution to the CP's predictive efficiency and removes the filters with the least contribution. Given the computational cost of the brute-force method, we also propose a Taylor-expansion approximation of a filter's contribution. Furthermore, we improve the global pruning method by protecting the most important filters within each layer from being pruned. In addition, we explore the ConfTr loss function, which is optimized to yield maximal CP efficiency, in the context of neural network pruning. We conduct extensive experimental studies and compare the trade-offs among predictive efficiency, computational efficiency, and network sparsity. These results offer guidance for deploying pruned neural networks in applications of conformal prediction where reliable predictions and reduced computational cost are both relevant, such as safety-critical settings.
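For orientation, below is a minimal sketch of the two quantities the abstract builds on, using standard formulations from the split-conformal and Taylor-pruning literature; the paper's exact definitions may differ, and the score function $s$, calibration size $n$, and filter activation $a_m$ are illustrative notation, not taken from the paper.

```latex
% Split-conformal prediction set (standard formulation; notation illustrative):
% given a nonconformity score s and calibration scores s_1, ..., s_n, set
\[
  \hat{q} \;=\; \text{the } \bigl\lceil (n+1)(1-\alpha) \bigr\rceil / n
  \text{ empirical quantile of } \{s_1, \dots, s_n\},
  \qquad
  C_\alpha(x) \;=\; \{\, y : s(x, y) \le \hat{q} \,\}.
\]
% First-order Taylor estimate of a filter's importance: the change in the
% loss L caused by zeroing the filter's activation a_m is approximated by
% the magnitude of the gradient-activation product
\[
  \mathcal{I}_m
  \;=\; \bigl| \mathcal{L}(a_m) - \mathcal{L}(a_m \!\to\! 0) \bigr|
  \;\approx\; \Bigl| \tfrac{\partial \mathcal{L}}{\partial a_m} \, a_m \Bigr|.
\]
```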