Abstract

Sparsity, a widely recognized path to curbing the computational demands of deep neural networks (DNNs), still faces a number of practical roadblocks despite a decade of intensive research on sparse neural networks. Existing structured sparsity patterns often fail to attain significant model compression, while the hardware challenges posed by unstructured sparsity are yet to be fully overcome. Because algorithmic and hardware innovations individually deliver limited benefits, a synergistic approach is necessary to unleash the potential of sparse DNNs. This work proposes a tightly integrated design methodology for sparsity patterns and the associated hardware platforms that reaches the highest model compression goals while simultaneously facilitating efficient hardware processing. We demonstrate that novel complementary sparsity patterns can offer the highest levels of expressiveness with inherent, hardware-exploitable regularity. Our novel dynamic training method converts the expressiveness of such sparsity configurations into highly accurate and compact sparse neural networks. Complementary sparsity is represented in a dense format and, when coupled with minimal yet strategic hardware modifications, can be processed with a dataflow that closely follows that of conventional dense matrix operations. We thus demonstrate that there is ample room for innovation beyond conventional techniques and immense practical potential for sparse neural networks through the synergistic design of sparsity patterns and hardware architectures.
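To make the dense-format packing idea concrete, the sketch below shows one possible reading of complementary sparsity: it assumes K sparse weight matrices whose non-zero supports are mutually disjoint and together tile the dense weight grid, so all K can be stored in a single dense buffer alongside a per-element owner map, and each sparse product can then be evaluated with an essentially dense dataflow plus a mask. The names (`owner`, `packed`, `sparse_matvec`) and the partitioning scheme are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch only -- NOT the paper's implementation.
# Assumption: "complementary sparsity" means K sparse weight matrices whose
# non-zero supports are mutually disjoint and jointly cover the dense grid,
# so all K can be packed into one dense buffer plus a per-element owner map.
import numpy as np

rng = np.random.default_rng(0)
K, rows, cols = 4, 8, 8                        # K complementary sub-matrices, each ~1/K dense

owner = rng.integers(0, K, size=(rows, cols))  # hypothetical owner map: which sub-matrix holds each entry
packed = rng.standard_normal((rows, cols))     # single dense buffer storing all K sparse matrices

x = rng.standard_normal(cols)

def sparse_matvec(k):
    # Sparse matvec for sub-matrix k, expressed as a dense-style operation:
    # a conventional dense multiply plus a selection mask (the only extra
    # step a modified processing element would need to apply).
    mask = (owner == k)
    return (packed * mask) @ x

# Because the K supports partition the grid, the K sparse products sum back
# to the ordinary dense product over the packed buffer.
dense_equivalent = packed @ x
assert np.allclose(sum(sparse_matvec(k) for k in range(K)), dense_equivalent)
```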
