Demystifying the Placement Policies of the NVIDIA GPU Thread Block Scheduler for Concurrent Kernels

Guin Gilman,Robert J Walls,Samuel S Ogden,Tian Guo

doi:10.1145/3453953.3453972

Journal: ACM SIGMETRICS Performance Evaluation Review
Publication Date: Mar 5, 2021
Citations: 14

Affiliation: Worcester Polytechnic Institute

Keywords: Streaming Multiprocessors, Increase In Execution Time, Local Resource Availability, NVIDIA's Pascal, Thread Block, Scheduler's Behavior, Scheduling Policy, Scheduling Algorithms, NVIDIA Volta, Performance Degradation

Abstract

In this work, we empirically derive the behavior of the NVIDIA GPU thread block scheduler under concurrent workloads for NVIDIA's Pascal, Volta, and Turing microarchitectures. In contrast to past studies that suggest the scheduler uses a round-robin policy to assign thread blocks to streaming multiprocessors (SMs), we find that the scheduler chooses the next SM based on the SM's local resource availability. We show how this scheduling policy can lead to significant, and seemingly counter-intuitive, performance degradation; for example, a decrease of one thread per block resulted in a 3.58X increase in execution time for one kernel in our experiments. We hope that our work will be useful for improving the accuracy of GPU simulators and will aid in the development of novel scheduling algorithms.
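The contrast between a round-robin policy and one driven by local resource availability can be illustrated with a toy simulation. The sketch below is not the policy derived in the paper; it is a minimal model under stated assumptions: threads are the only tracked resource, every SM has the same hypothetical capacity, blocks never finish, and the availability-based rule simply picks the SM with the most free threads (lowest index on ties). The names `SM`, `place_round_robin`, and `place_most_available` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class SM:
    # Hypothetical model: threads are the only resource an SM tracks.
    free_threads: int


def place_round_robin(num_sms, capacity, blocks):
    """Assign each block to the next SM in cyclic order that can fit it."""
    sms = [SM(capacity) for _ in range(num_sms)]
    placement = []
    cursor = 0
    for threads in blocks:
        for step in range(num_sms):
            idx = (cursor + step) % num_sms
            if sms[idx].free_threads >= threads:
                sms[idx].free_threads -= threads
                placement.append(idx)
                cursor = (idx + 1) % num_sms
                break
        else:
            # No SM has room; in this toy model the block is not placed.
            placement.append(None)
    return placement


def place_most_available(num_sms, capacity, blocks):
    """Assign each block to the SM with the most free threads (assumed rule)."""
    sms = [SM(capacity) for _ in range(num_sms)]
    placement = []
    for threads in blocks:
        # Ties are broken toward the lowest SM index (an assumption).
        idx = max(range(num_sms), key=lambda i: (sms[i].free_threads, -i))
        if sms[idx].free_threads >= threads:
            sms[idx].free_threads -= threads
            placement.append(idx)
        else:
            placement.append(None)
    return placement


if __name__ == "__main__":
    # Block sizes in threads, as if drawn from two concurrent kernels.
    blocks = [1024, 256, 256, 256, 1024]
    print("round-robin:   ", place_round_robin(3, 2048, blocks))
    print("most-available:", place_most_available(3, 2048, blocks))
```

With three SMs of 2048 threads each, the two rules agree on the first three blocks and then diverge: round-robin returns to SM 0 even though it already holds the large block, while the availability-based rule steers later blocks toward the emptier SMs. Under these assumptions, that divergence is the kind of placement difference that a small change in block size could plausibly amplify.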
