Abstract

Neural network pruning has become essential for deploying deep neural networks on resource-constrained edge devices, greatly reducing the number of network parameters without drastically compromising accuracy. One class of techniques proposed in the literature assigns an importance score to each parameter and prunes those of least importance. However, most of these methods rely on generalised estimates of each parameter's importance and ignore the context of the specific task at hand. In this paper, we propose a task-specific pruning approach, CSPrune, based on how effectively a neuron or a convolutional filter separates classes. Our axiomatic approach assigns an importance score based on how separable the different classes are in a unit's output activations or feature maps; preserving this class separation avoids a loss in classification accuracy. Additionally, most pruning algorithms prune individual connections or weights, producing an unstructured sparse network without considering whether the hardware the network is deployed on can exploit that sparsity. CSPrune instead prunes whole neurons or filters, which results in a more structured pruned network whose sparsity can be utilised more efficiently by the hardware. We evaluate our pruning method on various benchmark datasets, both small and large, and network architectures, and show that our approach outperforms comparable pruning techniques.
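The abstract does not specify the exact separability criterion, so the following is only a minimal sketch of the general idea: score each convolutional filter by a Fisher-style ratio of between-class to within-class variance of its pooled activations, then keep the highest-scoring filters (structured pruning). All names, the scoring function, and the `keep_ratio` parameter are illustrative assumptions, not the paper's actual method.

```python
# Hypothetical sketch: class-separability scoring and structured filter pruning.
# The Fisher-style ratio below is a stand-in for the paper's separability criterion.
import torch
import torch.nn as nn


def separability_scores(feature_maps: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """feature_maps: (N, C, H, W) activations of one conv layer; labels: (N,)."""
    # Reduce each feature map to one scalar response per sample and per filter.
    responses = feature_maps.mean(dim=(2, 3))                 # (N, C)
    overall_mean = responses.mean(dim=0)                      # (C,)
    between = torch.zeros_like(overall_mean)
    within = torch.zeros_like(overall_mean)
    for c in labels.unique():
        r_c = responses[labels == c]                          # responses of class c
        class_mean = r_c.mean(dim=0)
        between += r_c.shape[0] * (class_mean - overall_mean) ** 2
        within += ((r_c - class_mean) ** 2).sum(dim=0)
    # Higher score = classes are better separated by this filter's activations.
    return between / (within + 1e-8)


def prune_filters(conv: nn.Conv2d, scores: torch.Tensor, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Keep the filters with the highest separability scores (structured pruning)."""
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep = scores.argsort(descending=True)[:n_keep]
    new_conv = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                         stride=conv.stride, padding=conv.padding,
                         bias=conv.bias is not None)
    with torch.no_grad():
        new_conv.weight.copy_(conv.weight[keep])
        if conv.bias is not None:
            new_conv.bias.copy_(conv.bias[keep])
    return new_conv
```

Because whole filters are removed, the resulting layer is a smaller dense convolution rather than a sparse weight tensor, which is the structured sparsity the abstract argues hardware can exploit directly.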
