Abstract
Convolutional Neural Networks (CNNs) are one of the key factors behind the rapid progress of artificial intelligence techniques. However, as network capability grows, so does network size. Many approaches to reducing network size have been proposed, but they often produce unstructured networks that prevent efficient parallel computation. To avoid this problem, we propose a novel structured sparse fully connected layer (FCL) for CNNs. The aim of the proposed approach is to reduce the number of parameters in the FCLs, which account for a large fraction of all network parameters. Unlike the dense FCLs used in popular CNNs such as VGG-16, the proposed approach reduces the connections between the last convolutional layer and the first FCL. In addition, we present a GPU implementation of the proposed sparse FCLs using cuBLAS. On the ILSVRC-2012 dataset, the proposed approach achieves a 21.3-fold compression of VGG-16 with decreases of only 0.68% in top-1 accuracy and 0.31% in top-5 accuracy. The implementation of the proposed FCLs achieves speed-up factors of 14.97 for forward propagation and 16.67 for backward propagation compared to the uncompressed FCLs.
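To illustrate why a structured (rather than unstructured) sparse FCL maps well onto GPU libraries, the sketch below shows a hypothetical forward pass under the assumption that the connections between the last convolutional layer and the first FCL are restricted to a block-diagonal pattern: the g-th group of output neurons sees only the g-th group of inputs, so the sparse layer decomposes into G small dense GEMMs that cuBLAS handles directly. The grouping scheme, struct, and function names here (GroupedFC, forward_grouped_fc) are illustrative assumptions, not the paper's exact formulation.

```cpp
// Minimal sketch of a block-structured sparse FCL forward pass with cuBLAS.
// Assumption: the sparse weight matrix is block-diagonal, so the layer is
// computed as G independent dense GEMMs instead of one large sparse product.
#include <cublas_v2.h>
#include <cuda_runtime.h>
#include <vector>

struct GroupedFC {
    int groups;                   // number of connection groups G (assumed structure)
    int in_per_group;             // inputs per group (columns of each weight block)
    int out_per_group;            // outputs per group (rows of each weight block)
    std::vector<const float*> W;  // device pointers: one (out_per_group x in_per_group)
                                  // column-major weight block per group
};

// X: device pointer, (groups * in_per_group) x batch, column-major.
// Y: device pointer, (groups * out_per_group) x batch, column-major.
void forward_grouped_fc(cublasHandle_t handle, const GroupedFC& fc,
                        const float* X, float* Y, int batch) {
    const float alpha = 1.0f, beta = 0.0f;
    const int total_in  = fc.groups * fc.in_per_group;
    const int total_out = fc.groups * fc.out_per_group;
    for (int g = 0; g < fc.groups; ++g) {
        // Each group is a small dense GEMM: Y_g = W_g * X_g.
        // Leading dimensions index into the full stacked X and Y matrices.
        cublasSgemm(handle, CUBLAS_OP_N, CUBLAS_OP_N,
                    fc.out_per_group, batch, fc.in_per_group,
                    &alpha,
                    fc.W[g], fc.out_per_group,
                    X + g * fc.in_per_group,  total_in,
                    &beta,
                    Y + g * fc.out_per_group, total_out);
    }
}
```

Because every GEMM is dense and regular, this kind of structured sparsity keeps the parallel efficiency that unstructured pruning loses; the actual speed-up factors reported above come from the paper's own cuBLAS implementation, not from this sketch.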