A Quantitative Evaluation of Contemporary GPU Simulation Methodology

Akshay Jain, Timothy G Rogers, Mahmoud Khairy

doi:10.1145/3292040.3219658

Abstract

Contemporary Graphics Processing Units (GPUs) are used to accelerate highly parallel compute workloads. For the last decade, researchers in academia and industry have used cycle-level GPU architecture simulators to evaluate future designs. This paper performs an in-depth analysis of commonly accepted GPU simulation methodology, examining the effect that both the workload and the choice of instruction set architecture have on the accuracy of a widely used simulation infrastructure, GPGPU-Sim. We analyze numerous aspects of the architecture, validating the simulation results against real hardware. Based on a characterized set of over 1700 GPU kernels, we demonstrate that while the relative accuracy of compute-intensive workloads is high, inaccuracies in modeling the memory system result in much higher error when memory performance is critical. We then perform a case study using a recently proposed GPU architecture modification, demonstrating that the cross-product of workload characteristics and instruction set architecture choice can affect the predicted efficacy of the technique.

Full Text