Abstract

Graphics Processing Units (GPUs) are used as accelerators to improve the performance of highly data-parallel applications. A GPU is characterized by a number of Streaming Multiprocessors (SMs) and a large number of cores within each SM, together with a hierarchy of memories of different latencies and sizes. Program execution on a GPU therefore depends on a number of parameter values, both at compile time and at runtime. Obtaining optimal performance from these GPU resources requires exploring a large parameter space, which leads to many unproductive program executions. To alleviate this difficulty, machine learning–based autotuning systems have been proposed that predict the right configuration from a limited set of compile-time parameters. In this paper, we propose a two-stage machine learning–based autotuning framework that uses an expanded set of attributes. It predicts important parameters such as block size, occupancy, eligible warps, and execution time. The mean relative error in predicting the different parameters ranges from 6.5% to 16%. Dimensionality reduction of the feature set reduces the number of features by up to 50% while further improving prediction accuracy.
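For illustration only, the sketch below shows one way a two-stage predictor of this kind could be wired up with scikit-learn on synthetic data: a first stage predicts intermediate GPU parameters (occupancy, eligible warps) from compile-time features, and a second stage uses those predictions as additional inputs to predict execution time. The feature set, synthetic relations, and choice of random-forest models are assumptions for the sketch, not the framework described in the paper.

```python
# Hypothetical sketch of a two-stage prediction pipeline (not the authors' code).
# Stage 1 predicts intermediate GPU parameters (occupancy, eligible warps) from
# compile-time kernel features; Stage 2 feeds those predictions back in to
# predict execution time. All feature names and data here are placeholders.
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Placeholder compile-time features: e.g. registers/thread, shared memory/block,
# instruction-mix ratio, launch block size (all synthetic).
X = rng.random((500, 4))
# Placeholder targets: stage-1 (occupancy, eligible warps) and stage-2 (exec time).
occupancy = 1.0 - 0.5 * X[:, 0]                              # synthetic relation
eligible_warps = 48 * occupancy + rng.normal(0, 1, 500)
exec_time = 10 / (occupancy + 0.1) + rng.normal(0, 0.5, 500)

X_tr, X_te, occ_tr, occ_te, war_tr, war_te, t_tr, t_te = train_test_split(
    X, occupancy, eligible_warps, exec_time, random_state=0)

# Stage 1: predict intermediate parameters from compile-time features.
stage1_occ = RandomForestRegressor(random_state=0).fit(X_tr, occ_tr)
stage1_war = RandomForestRegressor(random_state=0).fit(X_tr, war_tr)

# Stage 2: augment the features with stage-1 predictions, then predict execution time.
aug_tr = np.column_stack([X_tr, stage1_occ.predict(X_tr), stage1_war.predict(X_tr)])
aug_te = np.column_stack([X_te, stage1_occ.predict(X_te), stage1_war.predict(X_te)])
stage2_time = RandomForestRegressor(random_state=0).fit(aug_tr, t_tr)

pred = stage2_time.predict(aug_te)
mre = np.mean(np.abs(pred - t_te) / t_te)                    # mean relative error
print(f"Mean relative error (execution time): {mre:.2%}")
```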
