Architectural support for address translation on GPUs

Bharath Pichai, Abhishek Bhattacharjee, Lisa Hsu

doi:10.1145/2654822.2541942

Abstract

The proliferation of heterogeneous compute platforms, of which CPU/GPU is a prevalent example, necessitates a manageable programming model to ensure widespread adoption. A key component of this is a shared unified address space between the heterogeneous units to obtain the programmability benefits of virtual memory. To this end, we are the first to explore GPU Memory Management Units (MMUs) consisting of Translation Lookaside Buffers (TLBs) and page table walkers (PTWs) for address translation in unified heterogeneous systems. We show the performance challenges posed by GPU warp schedulers on TLBs accessed in parallel with L1 caches, which provide many well-known programmability benefits. In response, we propose modest TLB and PTW augmentations that recover most of the performance lost by introducing L1 parallel TLB access. We also show that a little TLB-awareness can make other GPU performance enhancements (e.g., cache-conscious warp scheduling and dynamic warp formation on branch divergence) feasible in the face of cache-parallel address translation, bringing overheads into the range deemed acceptable for CPUs (10-15% of runtime). We presume this initial design leaves room for improvement but anticipate that our bigger insight, that a little TLB-awareness goes a long way in GPUs, will spur further work in this fruitful area.


Journal: ACM SIGARCH Computer Architecture News	Publication Date: Feb 24, 2014
Citations: 13

Similar Papers

Architectural support for address translation on GPUs
Bharath Pichai ... Abhishek Bhattacharjee
ACM SIGPLAN Notices | VOL. 49 | 24 Feb 2014

Architectural support for address translation on GPUs
Bharath Pichai ... Abhishek Bhattacharjee
24 Feb 2014

Efficient Synonym Filtering and Scalable Delayed Translation for Hybrid Virtual Caching
Chang Hyun Park ... Jaehyuk Huh
01 Jun 2016

Efficient synonym filtering and scalable delayed translation for hybrid virtual caching
Chang Hyun Park ... Taekyung Heo
ACM SIGARCH Computer Architecture News | VOL. 44 | 18 Jun 2016
