Abstract

Motivated by the fact that deep neural networks (DNNs) are typically highly over-parameterized, weight-pruning-based sparse training (ST) has become a practical way to reduce training computation and compress models. However, previous pruning algorithms use either a coarse-grained or a fine-grained sparsity pattern: the former limits the achievable pruning ratio, while the latter produces a drastically irregular sparsity distribution that is computation-intensive or logically complex to implement in hardware. Meanwhile, current DNN processors focus on sparse inference and cannot support emerging ST techniques. This paper proposes a co-design approach in which the algorithm is adapted to hardware constraints and the hardware exploits the algorithm's properties to accelerate sparse training. We first present a novel pruning algorithm, hybrid weight pruning, which combines channel-wise and line-wise pruning. It reaches a considerable pruning ratio while remaining hardware-friendly. We then design a hardware architecture, the Hybrid Pruning Processing Unit (HPPU), to accelerate the proposed algorithm. It features a two-level active data selector and a sparse convolution engine, which maximize hardware utilization when handling the hybrid sparsity patterns during training. We evaluate HPPU by synthesizing it in a 28 nm CMOS technology. HPPU achieves a 50.1% higher pruning ratio than coarse-grained pruning and 1.53× higher energy efficiency than fine-grained pruning. Its peak energy efficiency is 126.04 TFLOPS/W, outperforming the state-of-the-art trainable processor GANPU by 1.67×. When training a ResNet18 model, HPPU consumes 3.72× less energy, offers a 4.69× speedup, and maintains the unpruned accuracy.
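The abstract names the two pruning granularities but does not spell out the selection criteria. Below is a minimal, hedged sketch of how one hybrid channel-wise plus line-wise magnitude-pruning step could look; the L1-norm criterion, the interpretation of a "line" as a kernel row, and the two ratio parameters are illustrative assumptions, not details published in the abstract.

    import numpy as np

    def hybrid_prune(weights, channel_ratio=0.3, line_ratio=0.3):
        """Sketch of hybrid (channel-wise + line-wise) weight pruning.

        weights: 4-D conv tensor of shape (out_ch, in_ch, kh, kw).
        channel_ratio / line_ratio: assumed hyper-parameters controlling
        how much is removed at each granularity.
        """
        w = weights.copy()

        # Channel-wise (coarse-grained) step: zero the output channels
        # with the smallest L1 norms, removing whole filters at once.
        ch_norms = np.abs(w).sum(axis=(1, 2, 3))
        n_ch = int(channel_ratio * w.shape[0])
        pruned = np.argsort(ch_norms)[:n_ch]
        w[pruned] = 0.0

        # Line-wise step: within the surviving channels, zero whole kernel
        # rows ("lines") with the smallest L1 norms, so every removed unit
        # is still a contiguous block that hardware can skip cheaply.
        kept = np.setdiff1d(np.arange(w.shape[0]), pruned)
        line_norms = np.abs(w[kept]).sum(axis=3)          # (kept, in_ch, kh)
        k = int(line_ratio * line_norms.size)
        if k > 0:
            thresh = np.partition(line_norms.ravel(), k)[k]
            w[kept] *= (line_norms >= thresh)[..., None]  # broadcast over kw
        return w

For example, hybrid_prune(np.random.randn(64, 32, 3, 3)) zeroes roughly 30% of the filters and a further 30% of the remaining kernel rows, keeping every removed unit block-regular in the spirit of the hardware-friendly sparsity the paper claims.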


