Abstract
Modern graphics processing units (GPUs), with their massively parallel architectures, can boost the performance of both graphics and general-purpose applications. With the support of new programming tools, GPUs have become one of the most attractive platforms for exploiting high thread-level parallelism. Recent GPUs employ hierarchical cache memories to support irregular memory-access patterns. However, the L1 data cache exhibits poor efficiency in GPUs, mainly due to cache contention and resource congestion. This paper shows that the L1 data cache does not always improve application performance; in fact, many applications are slowed down by the use of the L1 data cache. In this paper, a novel cache bypassing mechanism (CARB) is proposed to increase the efficiency of GPU cache management and to improve GPU performance. The CARB mechanism exploits the characteristics of the currently executing application to estimate the performance impact of the L1 data cache on the GPU, and it then allows memory requests to bypass the cache in discrete phases during execution. The bypassing decision is made adaptively at runtime. Experimental results show that the CARB mechanism achieves an average speedup of 22% across a wide range of GPGPU applications.
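The phase-based adaptive decision described above can be illustrated with a minimal sketch. This is not the paper's actual algorithm (the abstract does not specify it); the controller below, with its phase length and hit-rate threshold, is an assumed simplification: execution is divided into fixed-length phases, the L1 hit rate is sampled in each phase, and the next phase bypasses the cache when caching appears unprofitable.

```python
# Illustrative sketch only, in the spirit of CARB's runtime-adaptive,
# phase-based bypassing. The class name, phase length, and threshold
# are assumptions, not taken from the paper.

class AdaptiveBypassController:
    def __init__(self, phase_len=1000, hit_rate_threshold=0.2):
        self.phase_len = phase_len                  # accesses per decision phase
        self.hit_rate_threshold = hit_rate_threshold
        self.bypass = False                         # decision for the current phase
        self._hits = 0
        self._accesses = 0

    def record_access(self, was_hit):
        """Record one L1 access; return True if subsequent requests
        should bypass the L1. The decision is revised once per phase."""
        self._accesses += 1
        if was_hit:
            self._hits += 1
        if self._accesses >= self.phase_len:
            hit_rate = self._hits / self._accesses
            # Bypass the next phase when the sampled hit rate is too low
            # to offset cache contention and resource congestion.
            self.bypass = hit_rate < self.hit_rate_threshold
            self._hits = self._accesses = 0
        return self.bypass
```

For a cache-unfriendly streaming workload (almost all misses), the controller would switch to bypassing after one sampling phase, and it would switch back if a later phase shows good locality.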