Abstract

The huge size of deep neural networks makes them difficult to deploy directly on embedded platforms with limited computational resources. In this article, we propose a novel trimming approach that identifies the redundant parameters of a trained deep neural network in a layer-wise manner to produce a compact network. This is achieved by minimizing a nonconvex sparsity-inducing term of the network parameters while keeping the layer response close to the original one. We present a proximal iteratively reweighted method to solve the resulting nonconvex model, which approximates the nonconvex objective by a weighted l1 norm of the network parameters. Moreover, to alleviate the computational burden, we develop a novel termination criterion for the subproblem solution, significantly reducing the total pruning time. A global convergence analysis and a worst-case O(1/k) ergodic convergence rate for the proposed algorithm are established. Numerical experiments demonstrate that the proposed approach is efficient and reliable.
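
The sketch below is a rough illustration of the proximal iteratively reweighted l1 idea described above, not the authors' exact formulation: it matches a single layer's response while shrinking weights with a weighted soft-thresholding (proximal) step whose weights are recomputed from the current iterate. The log-sum penalty, the step-size choice, and all names (prox_irl1_prune, weighted_soft_threshold, lam, eps) are illustrative assumptions.

```python
import numpy as np

def weighted_soft_threshold(W, tau):
    """Proximal operator of a weighted l1 norm: elementwise shrinkage by tau."""
    return np.sign(W) * np.maximum(np.abs(W) - tau, 0.0)

def prox_irl1_prune(W0, X, num_iters=50, lam=1e-2, eps=1e-3, step=None):
    """Illustrative proximal iteratively reweighted l1 sketch (not the paper's algorithm).

    Approximately minimizes
        0.5 * ||W X - W0 X||_F^2 + lam * sum(log(1 + |W_ij| / eps))
    by linearizing the concave penalty at the current iterate (giving a
    weighted l1 term) and taking a proximal gradient step.
    """
    W = W0.copy()
    Y0 = W0 @ X                          # original layer response to stay close to
    if step is None:
        # 1 / Lipschitz constant of the smooth response-matching term
        step = 1.0 / (np.linalg.norm(X, 2) ** 2 + 1e-12)
    for _ in range(num_iters):
        grad = (W @ X - Y0) @ X.T        # gradient of the data-fit term
        V = W - step * grad              # forward (gradient) step
        weights = lam / (np.abs(W) + eps)                 # reweighting from current iterate
        W = weighted_soft_threshold(V, step * weights)    # backward (proximal) step
    return W

# Usage: prune a random fully connected layer on synthetic activations.
rng = np.random.default_rng(0)
W0 = rng.standard_normal((64, 128))
X = rng.standard_normal((128, 256))
W_pruned = prox_irl1_prune(W0, X)
print("fraction of zeroed weights:", np.mean(W_pruned == 0))
```

In this sketch, the reweighting step assigns smaller thresholds to large weights and larger thresholds to small ones, which is what drives parameters deemed redundant toward exactly zero while preserving the layer's response.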
