Throughput optimization and resource allocation on GPUs under multi-application execution

Srinivasa Reddy Punyala, Arash Komaee, Theodoros Marinakis, Iraklis Anagnostopoulos

doi:10.23919/date.2018.8341982

Abstract

Platform heterogeneity prevails as a solution to the throughput and computational challenges imposed by parallel applications and technology scaling. Specifically, Graphics Processing Units (GPUs) are based on the Single Instruction Multiple Thread (SIMT) paradigm and can offer tremendous speedup for parallel applications. However, GPUs were designed to execute a single application at a time. Under simultaneous multi-application execution, the GPU's massive multi-threading paradigm causes applications to compete destructively for shared resources (caches and memory controllers), resulting in significant throughput degradation. This paper presents a methodology for minimizing interference in shared resources and providing efficient concurrent execution of multiple applications on GPUs. In particular, the proposed methodology (i) performs application classification; (ii) analyzes the per-class interference; (iii) finds the best matching between classes; and (iv) employs an efficient resource allocation. Experimental results show that the proposed approach increases system throughput for two concurrent applications by an average of 36% compared to the default execution and 10% compared to an exhaustive profile-based optimization technique.
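
As a rough illustration of step (iii), the sketch below pairs applications so that the summed per-class interference is minimal. It is not the authors' algorithm: the class names ("memory", "compute"), the interference scores, and the helpers `pair_cost` and `best_pairing` are all hypothetical, and the exhaustive search is only practical for a handful of applications.

```python
# Hypothetical per-class-pair interference scores (higher = worse co-run slowdown).
# These are illustrative placeholders, not measurements from the paper.
INTERFERENCE = {
    frozenset(["memory"]): 0.9,             # memory-bound with memory-bound
    frozenset(["memory", "compute"]): 0.2,  # memory-bound with compute-bound
    frozenset(["compute"]): 0.4,            # compute-bound with compute-bound
}


def pair_cost(class_a, class_b):
    """Look up the assumed interference score for two application classes."""
    return INTERFERENCE[frozenset([class_a, class_b])]


def best_pairing(apps):
    """Return (total_cost, pairs) minimizing summed interference.

    apps maps an application name to its class label. All perfect
    matchings are enumerated recursively, so this only suits small,
    even-sized sets of applications.
    """
    names = sorted(apps)
    if len(names) % 2 != 0:
        raise ValueError("need an even number of applications to pair")
    if not names:
        return 0.0, []
    first, rest = names[0], names[1:]
    best = None
    for partner in rest:
        # Pair `first` with `partner`, then solve the remaining applications.
        remaining = {n: apps[n] for n in rest if n != partner}
        sub_cost, sub_pairs = best_pairing(remaining)
        cost = pair_cost(apps[first], apps[partner]) + sub_cost
        if best is None or cost < best[0]:
            best = (cost, [(first, partner)] + sub_pairs)
    return best


if __name__ == "__main__":
    apps = {"app1": "memory", "app2": "memory",
            "app3": "compute", "app4": "compute"}
    print(best_pairing(apps))
```

With these assumed scores, the search avoids co-running the two memory-bound applications and pairs each with a compute-bound one instead.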

Similar Papers

Unified on-chip memory allocation for SIMT architecture
Ari B Hayes ... Eddy Z Zhang
10 Jun 2014

Fast 1-itemset frequency count using CUDA
Roger Luis Uy ... Nelson Marcos
01 Nov 2016

Study on Transient Temperature Field Parallel Computing in Cooling Control Based on a GPU Fourier Method
Liang Wang ... Yi-Sheng Zhang
01 Dec 2010

Nonlinear optimization with a massively parallel Evolution Strategy–Pattern Search algorithm on graphics hardware
Weihang Zhu
Applied Soft Computing Journal | VOL. 11
08 Jun 2010
