Abstract

Deep learning allows us to build powerful models for problems such as image classification, time series prediction, and natural language processing. This power comes at the cost of large storage and processing requirements, which machines with limited resources cannot always meet. In this paper, we compare different methods that tackle this problem with network pruning. A selection of pruning methodologies from the deep learning literature was implemented to demonstrate their results. Modern neural architectures combine different kinds of layers, such as convolutional, pooling, and dense layers. For the image classification task, we compare pruning techniques for dense layers (unit/neuron pruning and weight pruning) as well as for convolutional layers (using the L1 norm, a Taylor expansion of the loss to estimate the importance of convolutional filters, and Variable Importance in Projection with Partial Least Squares). This study aims to ease the optimization overhead of deep neural networks for both academic and commercial use.
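
To make the dense-layer and filter-level criteria concrete, the following sketch (plain NumPy, not the authors' implementation; function names, array shapes, and the pruning fraction are illustrative assumptions) shows magnitude-based weight pruning, neuron ranking for unit pruning, and L1-norm ranking of convolutional filters.

```python
import numpy as np

def prune_dense_weights(weight: np.ndarray, fraction: float) -> np.ndarray:
    """Weight pruning: zero out the given fraction of smallest-magnitude weights."""
    k = int(fraction * weight.size)
    if k == 0:
        return weight.copy()
    threshold = np.partition(np.abs(weight).ravel(), k - 1)[k - 1]
    return np.where(np.abs(weight) <= threshold, 0.0, weight)

def rank_neurons(weight: np.ndarray) -> np.ndarray:
    """Unit/neuron pruning: rank output neurons of a dense layer
    (shape (n_inputs, n_outputs)) by the L2 norm of their incoming weights;
    neurons listed first are the first candidates for removal."""
    return np.argsort(np.linalg.norm(weight, axis=0))

def rank_filters_l1(conv_weight: np.ndarray) -> np.ndarray:
    """L1-norm criterion: rank convolutional filters
    (shape (out_channels, in_channels, kh, kw)) by the sum of absolute
    kernel weights; filters listed first are the least important."""
    return np.argsort(np.abs(conv_weight).reshape(conv_weight.shape[0], -1).sum(axis=1))
```

In all three cases the pruned model is typically fine-tuned afterwards to recover accuracy; the functions above only select what to remove.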

Highlights

  • Deep learning has been on the rise for quite a while and has proved itself to be the state of the art for many real-life problems related to text, images, audio, and video

  • Variable Importance in Projection (VIP) with PLS is far more computationally expensive than the other techniques (a sketch of the VIP computation follows these highlights)

  • PLS maintains a higher accuracy than the original network up to 50% pruning, whereas the other methods never regain the accuracy they started with
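
The cost of the VIP criterion comes from fitting a PLS model on layer activations and then scoring every feature at each pruning step. Below is a hedged sketch of the VIP computation from a fitted scikit-learn PLSRegression; it follows the standard VIP formula and is not taken from the paper's implementation, and the usage names (activations, labels) are hypothetical.

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

def vip_scores(pls: PLSRegression) -> np.ndarray:
    """Variable Importance in Projection (VIP) for a fitted PLSRegression.
    Returns one score per input feature; low-VIP features are pruning candidates."""
    T = pls.x_scores_    # latent scores, shape (n_samples, n_components)
    W = pls.x_weights_   # feature weights, shape (n_features, n_components)
    Q = pls.y_loadings_  # target loadings, shape (n_targets, n_components)
    p = W.shape[0]
    # variance in Y explained by each latent component
    ss = np.sum(T ** 2, axis=0) * np.sum(Q ** 2, axis=0)
    # squared, per-component-normalised feature weights
    w2 = (W / np.linalg.norm(W, axis=0)) ** 2
    return np.sqrt(p * (w2 @ ss) / ss.sum())

# Illustrative usage: activations is (n_samples, n_features), labels is
# (n_samples, n_classes). Refitting the PLS model for every pruning step
# is what makes this criterion expensive.
# pls = PLSRegression(n_components=2).fit(activations, labels)
# importance = vip_scores(pls)
```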


Summary

Introduction

Deep learning has been on the rise for quite a while and has proved itself to be the state of the art for many real-life problems related to text, images, audio, and video. Researchers have achieved unexpectedly good results in tasks such as image classification, image segmentation, object classification and detection, text mining, recommendation engines, speech recognition, and predictive modelling by employing deep learning in their approaches. Performance improves as we train deeper and deeper networks: deeper architectures reduced the top-1 and top-5 error by a significant margin compared with earlier state-of-the-art architectures such as AlexNet [14] (8 layers) and LeNet [15] (6 layers). However, this performance comes at the price of computation as well as storage cost.

Objectives
Results
Conclusion

