Abstract

Deep learning has achieved remarkable success in many areas, including computer vision, natural language processing, and robot control. The convolutional neural network (CNN) is the most commonly used model among deep neural networks. Despite their effectiveness at feature abstraction, CNNs require substantial computation even in the inference stage, which is a major obstacle to their deployment on embedded and mobile devices. To address this problem, we 1) propose to decompose the convolutional layers and fully connected layers of CNNs with the naive semi-discrete matrix decomposition (SDD), which achieves low-rank factorization and parameter sparsity at the same time; 2) propose a layer-merging scheme that merges two of the three resulting matrices, which avoids the blow-up of intermediate data caused by the naive semi-discrete matrix decomposition; and 3) propose a progressive training strategy to speed up convergence. We apply this optimized method to image classification and object detection networks. With a loss of network accuracy of at most 1%, we achieve significant reductions in running time and model size. The fully connected layer of the LeNet network achieves a $7\times$ speedup in the inference stage. In Faster R-CNN, the weight parameters are reduced by a factor of $5.85\times$, and inference is accelerated by a factor of $1.75\times$.
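
To make the decomposition concrete, below is a minimal numpy sketch of the greedy SDD fitting loop in the style of Kolda and O'Leary's classic algorithm, which factors a weight matrix as $W \approx X D Y^T$ with ternary $X, Y \in \{-1, 0, 1\}$ and a positive diagonal $D$; the function names, the rank `K`, and the iteration counts here are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def _best_ternary(s):
    """Given s = R @ y (or R.T @ x), return the ternary vector x in
    {-1,0,1}^m maximizing (x @ s)^2 / nnz(x): take the top-J entries of
    |s| with matching signs, choosing the support size J that scores best."""
    order = np.argsort(-np.abs(s))            # indices by descending |s_i|
    csum = np.cumsum(np.abs(s)[order])        # prefix sums of sorted |s|
    scores = csum ** 2 / np.arange(1, len(s) + 1)
    J = int(np.argmax(scores)) + 1            # best support size
    x = np.zeros_like(s, dtype=float)
    x[order[:J]] = np.sign(s[order[:J]])
    return x

def sdd(W, K, inner_iters=10):
    """Greedy SDD: W ~ sum_k d_k * outer(x_k, y_k) with ternary x_k, y_k."""
    m, n = W.shape
    R = W.astype(float).copy()                # residual
    X = np.zeros((m, K)); Y = np.zeros((n, K)); d = np.zeros(K)
    for k in range(K):
        y = np.zeros(n)                       # init: pick the column of R
        y[np.argmax(np.abs(R).sum(axis=0))] = 1.0  # with largest L1 norm
        for _ in range(inner_iters):          # alternate x- and y-updates
            x = _best_ternary(R @ y)
            y = _best_ternary(R.T @ x)
        dk = (x @ R @ y) / ((x @ x) * (y @ y))  # least-squares scale
        R -= dk * np.outer(x, y)              # peel off the rank-1 term
        X[:, k], Y[:, k], d[k] = x, y, dk
    return X, d, Y                            # W ~ X @ np.diag(d) @ Y.T
```

In the same spirit as the layer-merging scheme described above, the diagonal factor can be folded into one ternary matrix ahead of time, e.g. `Yd = Y * d`, so inference computes `X @ (Yd.T @ v)` with only two products and never materializes the full intermediate of the three-factor form; this is one plausible reading of merging two of the three result matrices, not a claim about the authors' exact implementation.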
