Abstract

State-of-the-art deep neural networks play an increasingly important role in artificial intelligence, but the huge number of parameters in these networks brings high memory cost and computational complexity. To address this problem, filter pruning is widely used for neural network compression and acceleration. However, existing algorithms focus mainly on pruning a single model, and few results are available on multi-task pruning, which is capable of pruning multiple models while promoting learning performance. By utilizing the filter sharing technique, this paper aims to establish a multi-task pruning framework for simultaneously pruning and merging filters in multi-task networks. The problem of selecting important filters is formulated as a many-objective optimization problem, in which three criteria are adopted as objectives, and is solved by a dedicated many-objective optimization algorithm. To preserve the network structure, an index matrix is introduced to regulate information sharing during multi-task training. The proposed multi-task pruning algorithm is flexible in that it can be performed with either adaptive or pre-specified pruning rates. Extensive experiments verify the applicability and superiority of the proposed method on both single-task and multi-task pruning.
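As a rough illustration of the index-matrix idea, the sketch below records which filters each task keeps in a shared layer, so that filters kept by all tasks can be merged while filters kept by no task are pruned. This is a minimal sketch under assumptions: the binary row-per-task layout, the NumPy representation, and the selected filter indices are illustrative and are not taken from the paper.

```python
import numpy as np

# Hypothetical per-layer filter index matrix: rows are tasks, columns are
# filters in one shared layer; a 1 means the task keeps that filter.
n_tasks, n_filters = 2, 8
index_matrix = np.zeros((n_tasks, n_filters), dtype=int)

# Suppose a many-objective selection kept these filters for each task
# (indices chosen arbitrarily here for illustration).
kept = {0: [0, 1, 3, 5], 1: [1, 2, 3, 6]}
for task, cols in kept.items():
    index_matrix[task, cols] = 1

# Filters selected by every task are candidates for sharing/merging;
# columns selected by no task can be pruned outright.
shared = np.where(index_matrix.all(axis=0))[0]
pruned = np.where(~index_matrix.any(axis=0))[0]
print("shared filters:", shared.tolist())   # [1, 3]
print("pruned filters:", pruned.tolist())   # [4, 7]
```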

Highlights

  • A multi-task pruning algorithm, Multi-task Filter Index Sharing (MFIS), is developed based on the filter index sharing approach

  • A filter sharing strategy is proposed that is capable of automatically sharing filters both within a single network and across different networks

Introduction

Following their success in the ImageNet challenge, deep neural networks (DNNs) have been extensively utilized in a wide variety of applications [1,2,3] and have shown superiority over other approaches. As network structures become deeper and larger, the number of parameters increases considerably, resulting in a dramatic escalation in computing cost when DNNs are deployed. DNNs with relatively low computing cost yet high accuracy are urgently needed, which has given rise to network pruning techniques that simplify the structure and reduce the number of parameters. Weight-pruning methods remove individual connections within filters or across different layers, thereby reducing the memory cost. The main weakness of weight pruning is its unstructured manner of operation: the unstructured sparsity of the pruned filters makes it hard to exploit existing basic linear algebra subprograms (BLAS) libraries, so weight pruning is not effective in reducing computational cost. Filter pruning, in contrast, yields structured sparsity, reducing storage usage on devices and achieving practical acceleration.
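To make the contrast concrete, the following minimal sketch removes whole output channels from a convolutional layer, so the result stays a dense, smaller layer; this is why filter pruning translates directly into storage and computation savings. The layer sizes, the L1-norm importance criterion, and the `prune_conv_filters` helper are assumptions for illustration only, not the method proposed in this paper.

```python
import torch
import torch.nn as nn

def prune_conv_filters(conv: nn.Conv2d, keep_ratio: float = 0.5) -> nn.Conv2d:
    """Structured (filter) pruning: drop whole output channels by L1 norm.

    Hypothetical helper for illustration; real filter pruning must also
    adjust the next layer's input channels and any BatchNorm parameters.
    """
    # Rank filters by the L1 norm of their weights (one score per output channel).
    scores = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    n_keep = max(1, int(keep_ratio * conv.out_channels))
    keep_idx = torch.argsort(scores, descending=True)[:n_keep]

    # Build a smaller dense layer containing only the kept filters.
    pruned = nn.Conv2d(conv.in_channels, n_keep, conv.kernel_size,
                       stride=conv.stride, padding=conv.padding,
                       bias=conv.bias is not None)
    pruned.weight.data = conv.weight.data[keep_idx].clone()
    if conv.bias is not None:
        pruned.bias.data = conv.bias.data[keep_idx].clone()
    return pruned

conv = nn.Conv2d(16, 32, kernel_size=3, padding=1)
small = prune_conv_filters(conv, keep_ratio=0.5)
print(small)  # Conv2d with 16 output filters -- still dense, half the computation
```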
