Abstract

General-purpose computing using graphics processing units (GPGPUs) is an attractive option for acceleration of applications with massively data-parallel tasks. While performance of modern GPGPUs is increasing rapidly, the power consumption of these devices is becoming a major concern. In particular, execution units and register file are among the top three most power-hungry components in GPGPUs. In this work, we exploit trivial instructions to reduce power consumption in GPGPUs. Trivial instructions are those instructions that do not need computations, i.e., multiplication by one. We found that, during the course of a program's execution, a GPGPU executes many trivial instructions. Execution of these instructions wastes power unnecessarily. In this work, we propose trivial bypassing which skips execution of trivial instructions and avoids unnecessary allocation of resources for trivial instructions. By power gating execution units and skipping trivial computing, trivial bypassing reduces both static and dynamic power. Also, trivial bypassing reduces dynamic energy of register file by avoiding access to register file for source and/or destination operands of trivial instructions. While trivial bypassing reduces energy of GPGPUs, it has detrimental impact on performance as a power-gated execution unit requires several cycles to resume its normal operation. Conventional warp schedulers are oblivious to the status of execution units. We propose a new warp scheduler that prioritizes warps based on availability of execution units. We also propose a set of new power management techniques to reduce performance penalty of power gating, further. To increase energy saving of trivial bypassing, we also propose approximating operands of instructions. We offer a set of new techniques to approximate both integer and floating-point instructions and increase the pool of trivial instructions. Our evaluations using a diverse set of benchmarks reveal that our proposed techniques are able to reduce energy of execution units by 11.2% and dynamic energy of register file by 12.2% with minimal performance and quality degradation.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call