Abstract

Hardware caches are widely employed in GPGPUs to achieve higher performance and energy efficiency. Incorporating hardware caches in GPGPUs, however, does not immediately guarantee improved performance and energy efficiency because of high cache contention and thrashing. To address the inefficiency of GPGPU caches, various adaptive techniques (e.g., warp limiting) have been proposed. However, relatively little work has been done on creating an architectural framework that tightly integrates adaptive cache management techniques, or on investigating their effectiveness and interaction. To bridge this gap, we propose IACM, integrated adaptive cache management for high-performance and energy-efficient GPGPU computing. IACM integrates state-of-the-art adaptive cache management techniques (i.e., cache indexing, bypassing, and warp limiting) in a unified architectural framework. Our quantitative evaluation demonstrates that IACM significantly improves the performance and energy efficiency of various GPGPU workloads over the baseline architecture (by 98.1 and 61.9 percent on average, respectively), achieves considerably higher performance than the state-of-the-art technique (by up to 361.4 percent and by 7.7 percent on average), and delivers significant performance and energy-efficiency gains over the baseline GPGPU architecture enhanced with advanced architectural technologies.
