Abstract

This paper introduces an architectural technique that reduces the energy consumption of Tensor Cores in GPGPUs. Over the past few years, deep neural networks (DNNs) have become a compelling solution for many applications such as image classification, speech recognition, and natural language processing. Various hardware frameworks have been proposed to accelerate DNNs. In particular, Tensor Cores in NVIDIA GPGPUs offer significant speedup over previous GPGPU architectures. However, this success comes at the cost of excessive energy consumption. Value-based optimization techniques have been used to accelerate DNNs; in particular, several studies exploit sparse values to skip unnecessary computations. However, the majority of these studies focus on accelerating DNNs rather than on saving energy. In this work, we exploit power gating to reduce the energy of Tensor Cores. We show that blindly applying power gating to multipliers results in significant performance loss due to the timing overhead of power gating. To mitigate this performance penalty, we propose sparsity-aware power gating (SPG), which monitors the inputs of multipliers and turns them off only if the inputs remain sparse for long intervals. We further improve SPG with an adaptive technique that dynamically changes the power gating policy based on how frequently the inputs of the multipliers change. Our experimental results show that the proposed technique can achieve 21% energy saving in Tensor Cores with negligible impact on performance while maintaining accuracy.
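
The gating policy described above can be illustrated with a small cycle-level sketch. This is not the paper's implementation: the abstract gives no thresholds, latencies, or adaptation rule, so the function names (`simulate_gating`, `pick_threshold`), the three-cycle wake-up cost, and every threshold value below are hypothetical and chosen only to show the trade-off. The sketch models one multiplier operand per cycle, treats a zero operand as a skippable multiplication, and compares gating immediately on any zero against gating only after a run of zeros.

```python
def simulate_gating(operands, idle_threshold, wakeup_cycles=3):
    """Toy model of one power-gated multiplier (all parameters are assumptions).

    The multiplier is gated once `idle_threshold` consecutive zero operands
    have been seen. A non-zero operand arriving while gated costs
    `wakeup_cycles` stall cycles. Returns (gated_cycles, stall_cycles).
    """
    gated = False
    zero_run = 0
    gated_cycles = 0   # cycles spent powered off (proxy for energy saved)
    stall_cycles = 0   # cycles lost waiting for wake-up (performance penalty)
    for value in operands:
        if value == 0:
            zero_run += 1
            if gated:
                gated_cycles += 1
            elif zero_run >= idle_threshold:
                gated = True  # takes effect from the next cycle
        else:
            if gated:
                stall_cycles += wakeup_cycles
                gated = False
            zero_run = 0
    return gated_cycles, stall_cycles


def pick_threshold(window, relaxed=2, conservative=8, max_toggles=2):
    """Hypothetical adaptive rule: gate more cautiously when inputs toggle often.

    Counts zero/non-zero transitions in a recent window of operands and
    returns a larger idle threshold when the input changes frequently.
    """
    toggles = sum((a == 0) != (b == 0) for a, b in zip(window, window[1:]))
    return conservative if toggles > max_toggles else relaxed


trace = [0, 5, 0, 5, 0, 0, 0, 0, 0, 0, 7]
naive = simulate_gating(trace, idle_threshold=1)           # gate on any zero
sparsity_aware = simulate_gating(trace, idle_threshold=3)  # gate on long runs only
print(naive, sparsity_aware)  # (5, 9) (3, 3)
```

On this toy trace, gating on every zero pays nine stall cycles to gain five gated cycles, while waiting for a run of three zeros pays three stall cycles for three gated cycles. That is the qualitative effect the abstract attributes to gating only when inputs stay sparse for long intervals, though the numbers themselves carry no meaning beyond the example.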
