Abstract
Modern General-Purpose Graphics Processing Units (GPGPUs) offer high throughput for parallel applications through their hundreds of integrated cores. However, some applications experience performance saturation, and even degradation, as the number of cores increases. At present, the scheduler in the GPU hardware allocates all available resources in order to maximize their utilization. We observed that applications prefer a specific set of resources, and that using the remaining, redundant resources can reduce application throughput. To overcome this problem, we first classify applications into two types: type-I applications, which predominantly require processing cores, and type-II applications, which rely on the performance of the memory system. Based on this classification, we propose an Application-aware Scalable Architecture (ApSA) for GPGPUs that tailors the GPU resources at run time to present an optimal set of resources to the running application. The results are analyzed and compared in terms of instructions per cycle, bandwidth utilization, and branch divergence. With the proposed technique, applications identified as type-I incur an average profiling overhead of 1.6%, and type-II applications incur an average profiling overhead of 1.15%. The average power saved by clock-gating redundant resources for type-II applications is 20.08%.
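
The classification idea can be illustrated with a minimal sketch. It is not the ApSA mechanism itself, which the abstract does not specify. The sketch assumes that a short profiling window measures instructions per cycle (IPC) at two core counts, and that an application counts as type-I when IPC keeps scaling with the number of cores and as type-II when it saturates or degrades. The names `ProfileSample`, `classify`, and `cores_to_enable`, the `min_scaling` threshold, and the `saturation_cores` parameter are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ProfileSample:
    active_cores: int  # cores enabled during the profiling window
    ipc: float         # instructions per cycle measured in that window


def classify(low: ProfileSample, high: ProfileSample,
             min_scaling: float = 0.5) -> str:
    """Label an application from two profiling samples (illustrative rule).

    Scaling efficiency is the relative IPC gain divided by the relative
    increase in core count: 1.0 is linear scaling, 0.0 is saturation, and
    a negative value is degradation. The 0.5 threshold is an assumption.
    """
    if high.active_cores <= low.active_cores:
        raise ValueError("high must use more cores than low")
    if low.ipc <= 0:
        raise ValueError("low.ipc must be positive")
    core_ratio = high.active_cores / low.active_cores
    ipc_ratio = high.ipc / low.ipc
    efficiency = (ipc_ratio - 1.0) / (core_ratio - 1.0)
    return "type-I" if efficiency >= min_scaling else "type-II"


def cores_to_enable(app_type: str, total_cores: int,
                    saturation_cores: int) -> int:
    """Keep all cores for type-I; cap type-II at its saturation point.

    Cores above the returned count are the redundant ones that a
    hardware scheme could clock-gate.
    """
    if app_type == "type-I":
        return total_cores
    return min(saturation_cores, total_cores)
```

Under this rule, an application whose IPC rises from 100 to 190 when the core count doubles would be labeled type-I, while one whose IPC rises only from 100 to 105 would be labeled type-II and restricted to fewer cores.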