Abstract

Convolutional Neural Networks (CNNs) are one of the factors behind the rapid development of artificial intelligence techniques. However, as the capability of a network increases, its size grows larger. To date, several works have tackled the reduction of network size. In many cases, these approaches produce an unstructured network, which prevents efficient parallel computation. To avoid this problem, we propose a novel structured sparse fully connected layer (FCL) for CNNs. The aim of the proposed approach is to reduce the number of parameters in the FCLs, which account for a large part of the network parameters. Unlike the general FCLs used in popular CNNs such as VGG‐16, the proposed approach reduces the connections between the last convolutional layer and the first FCL. In addition, we show an implementation of the proposed sparse FCLs on the GPU using cuBLAS. On the ILSVRC‐2012 dataset, the proposed approach achieves 21.3x compression of VGG‐16 with decreases of 0.68% in top‐1 accuracy and 0.31% in top‐5 accuracy. The implementation of the proposed FCLs achieves speed‐up factors of 14.97 and 16.67 for forward and backward propagation, respectively, compared to the noncompressed FCLs.
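For concreteness, the following is a minimal sketch (not the authors' released code) of how a structured sparse FCL could be evaluated with a single cuBLAS call, assuming a block-diagonal connection pattern between input and output features. The block layout, sizes, and variable names below are illustrative assumptions; the paper's actual connection structure may differ.

// Minimal sketch of a block-diagonal "structured sparse" FCL forward pass with cuBLAS.
// The block-diagonal pattern and all sizes are illustrative assumptions, not the authors' code.
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>
#include <cublas_v2.h>

int main() {
    const int G = 4;        // number of independent blocks (assumed structure)
    const int inBlk = 8;    // input features per block
    const int outBlk = 4;   // output features per block
    // A dense FCL would need (G*inBlk) x (G*outBlk) weights;
    // the block-diagonal layer stores only G * outBlk * inBlk of them.

    std::vector<float> hW(G * outBlk * inBlk), hX(G * inBlk), hY(G * outBlk, 0.f);
    for (size_t i = 0; i < hW.size(); ++i) hW[i] = 0.01f * (float)i;
    for (size_t i = 0; i < hX.size(); ++i) hX[i] = 1.0f;

    float *dW, *dX, *dY;
    cudaMalloc((void**)&dW, hW.size() * sizeof(float));
    cudaMalloc((void**)&dX, hX.size() * sizeof(float));
    cudaMalloc((void**)&dY, hY.size() * sizeof(float));
    cudaMemcpy(dW, hW.data(), hW.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dX, hX.data(), hX.size() * sizeof(float), cudaMemcpyHostToDevice);

    cublasHandle_t handle;
    cublasCreate(&handle);
    const float alpha = 1.f, beta = 0.f;

    // Each block g computes y_g = W_g * x_g, where W_g is outBlk x inBlk
    // (stored column-major). One strided-batched GEMM covers all G blocks.
    cublasSgemmStridedBatched(handle,
        CUBLAS_OP_N, CUBLAS_OP_N,
        outBlk, 1, inBlk,
        &alpha,
        dW, outBlk, (long long)outBlk * inBlk,   // A: weight blocks
        dX, inBlk,  (long long)inBlk,            // B: input slices
        &beta,
        dY, outBlk, (long long)outBlk,           // C: output slices
        G);

    cudaMemcpy(hY.data(), dY, hY.size() * sizeof(float), cudaMemcpyDeviceToHost);
    for (int g = 0; g < G; ++g)
        printf("block %d, y[0] = %f\n", g, hY[g * outBlk]);

    cublasDestroy(handle);
    cudaFree(dW); cudaFree(dX); cudaFree(dY);
    return 0;
}

Because every block has the same shape in this sketch, the whole sparse layer reduces to one strided-batched GEMM rather than many scattered multiplications, which is the kind of regularity that keeps a structured sparse layer amenable to efficient parallel computation on the GPU.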
