Abstract
We explore the use of synthetic benchmarks for the training phase of machine-learning-based automatic performance tuning. We focus on the problem of predicting whether the use of local memory on a GPU is beneficial for caching a single target array in a GPU kernel. We show that the use of only 13 real benchmarks leads to poor prediction accuracy (about 58%) for the 13 leave-one-out models trained using these benchmarks, even when the model features are sufficiently comprehensive. We define a metric, called the average vicinity density, to measure the quality of a training set. We then use it to demonstrate that the poor accuracy of the models built with the real benchmarks is indeed due to the limited size and coverage of the training set. In contrast, the use of a properly generated set of 90K synthetic benchmarks leads to significantly better accuracy, up to 87%. These results validate our approach of using synthetic benchmarks for training machine learning models. We describe a synthetic benchmark template for the local memory optimization. We then present two approaches that use this template and a seed set of real benchmarks to generate a large number of synthetic benchmarks. We also explore the impact of the number of synthetic benchmarks used in training.
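To make the target optimization concrete, the sketch below shows the kind of transformation whose profitability the paper's models predict: staging a single, repeatedly read target array in on-chip local memory before the compute loop. It is a minimal illustration written in CUDA (where OpenCL's "local memory" is called "shared memory"); the kernel, array names, and sizes are hypothetical and not taken from the paper.

// Illustrative sketch only. A hypothetical kernel caches the target
// array `coef` in shared memory; whether this staging helps or hurts
// is the binary property the paper's models are trained to predict.
#include <cstdio>
#include <cuda_runtime.h>

#define N 1024         // hypothetical problem size
#define COEF_SIZE 256  // hypothetical size of the cached target array

__global__ void apply_coefs(const float *in, const float *coef,
                            float *out, int n) {
    // Stage the target array into shared memory once per thread block.
    __shared__ float s_coef[COEF_SIZE];
    for (int i = threadIdx.x; i < COEF_SIZE; i += blockDim.x)
        s_coef[i] = coef[i];
    __syncthreads();

    int idx = blockIdx.x * blockDim.x + threadIdx.x;
    if (idx < n) {
        float acc = 0.0f;
        // Every thread reads the whole cached array, so reuse is high
        // and the copy is likely profitable; with low reuse, the extra
        // staging and shared-memory pressure can instead slow it down.
        for (int j = 0; j < COEF_SIZE; ++j)
            acc += in[idx] * s_coef[j];
        out[idx] = acc;
    }
}

int main() {
    float *in, *coef, *out;
    cudaMallocManaged(&in, N * sizeof(float));
    cudaMallocManaged(&coef, COEF_SIZE * sizeof(float));
    cudaMallocManaged(&out, N * sizeof(float));
    for (int i = 0; i < N; ++i) in[i] = 1.0f;
    for (int i = 0; i < COEF_SIZE; ++i) coef[i] = 0.5f;

    apply_coefs<<<(N + 255) / 256, 256>>>(in, coef, out, N);
    cudaDeviceSynchronize();
    printf("out[0] = %f\n", out[0]);  // expect 256 * 0.5 = 128
    cudaFree(in); cudaFree(coef); cudaFree(out);
    return 0;
}

The trade-off the models must capture is visible here: the shared-memory copy pays off only when per-block reuse of the array outweighs the staging cost and the reduction in occupancy from the extra shared-memory usage, which is why a large, diverse training set matters.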