Abstract

As a cost-effective compute device, the Graphics Processing Unit (GPU) has been widely embraced in high-performance computing. GPUs are characterized by massive thread-level parallelism and high memory bandwidth. Although GPUs have shown tremendous potential, recent GPU architecture research focuses mainly on the compute units, and full-system exploration is rare because accurate simulators that expose the hardware organization of both the GPU compute units and the memory system are lacking. To fill this void, we build a GPU simulator called VxGPUSim that supports simulation with detailed performance, timing, and power consumption statistics. Our experimental evaluation demonstrates that VxGPUSim can faithfully reveal the internal execution details of GPU global memory under various memory configurations. VxGPUSim can thus enable further research on the design of GPU global memory for performance and energy tradeoffs.

