Accelerating Convolutional Neural Networks for Mobile Applications

Peisong Wang,Jian Cheng

doi:10.1145/2964284.2967280

Abstract

Convolutional neural networks (CNNs) have achieved remarkable performance in a wide range of computer vision tasks, typically at the cost of massive computational complexity. The low speed of these networks may hinder real-time applications especially when computational resources are limited. In this paper, an efficient and effective approach is proposed to accelerate the test-phase computation of CNNs based on low-rank and group sparse tensor decomposition. Specifically, for each convolutional layer, the kernel tensor is decomposed into the sum of a small number of low multilinear rank tensors. Then we replace the original kernel tensors in all layers with the approximate tensors and fine-tune the whole net with respect to the final classification task using standard backpropagation. Comprehensive experiments on ILSVRC-12 demonstrate significant reduction in computational complexity, at the cost of negligible loss in accuracy. For the widely used VGG-16 model, our approach obtains a 6.6$\times$ speed-up on PC and 5.91$\times$ speed-up on mobile device of the whole network with less than 1\% increase on top-5 error.

Full Text