New Model-Based Methods and Algorithms for Performance and Energy Optimization of Data Parallel Applications on Homogeneous Multicore Clusters

Alexey Lastovetsky, Ravi Reddy Manumachu

doi:10.1109/tpds.2016.2608824

Abstract

Modern homogeneous parallel platforms are composed of tightly integrated multicore CPUs. This tight integration has resulted in the cores contending for various shared on-chip resources such as Last Level Cache (LLC) and interconnect, leading to resource contention and non-uniform memory access (NUMA). Due to these newly introduced complexities, the performance and energy profiles of real-life scientific applications on these platforms are not smooth and may deviate significantly from the shapes that allowed traditional and state-of-the-art load balancing algorithms to minimize their computation time. In this paper, we propose new model-based methods and algorithms for minimization of time and energy of computations for the most general shapes of performance and energy profiles of data parallel applications observed on modern homogeneous multicore clusters. We formulate the performance and energy optimization problems and present efficient algorithms of complexity O(p^2) solving these problems, where p is the number of processors. It is important to note that the globally optimal solutions found by these algorithms may not load-balance the application. We experimentally study the efficiency and scalability of our algorithms for two data parallel applications, matrix multiplication and fast Fourier transform, on a modern multicore CPU and clusters of such CPUs. We also demonstrate the optimality of solutions determined by our algorithms.
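The remark that a globally optimal solution may not load-balance the application can be illustrated with a small sketch. The code below is not the paper's O(p^2) algorithm, whose details the abstract does not give. It is a plain dynamic program, assumed here only for illustration, that splits n work units over p identical processors so that the largest per-processor time is minimal. The function name `min_time_partition` and the time profile are hypothetical and the profile values are invented, not measured.

```python
from typing import List, Sequence, Tuple


def min_time_partition(
    n: int, p: int, time_of: Sequence[float]
) -> Tuple[float, List[int]]:
    """Split n work units over p identical processors, minimizing parallel time.

    time_of[x] is the time one processor needs for x units, for x = 0..n.
    This is an illustrative O(p * n^2) dynamic program, not the paper's method.
    """
    inf = float("inf")
    # best[k][w]: smallest achievable maximum time when k processors share w units.
    best = [[inf] * (n + 1) for _ in range(p + 1)]
    # choice[k][w]: units given to processor k in one optimal split of w units.
    choice = [[0] * (n + 1) for _ in range(p + 1)]
    best[0][0] = 0.0
    for k in range(1, p + 1):
        for w in range(n + 1):
            for x in range(w + 1):
                t = max(time_of[x], best[k - 1][w - x])
                if t < best[k][w]:
                    best[k][w] = t
                    choice[k][w] = x
    # Walk the recorded choices back to recover one optimal distribution.
    parts: List[int] = []
    w = n
    for k in range(p, 0, -1):
        parts.append(choice[k][w])
        w -= choice[k][w]
    return best[p][n], parts


# Invented, non-smooth profile: 3 units are unusually slow (9 time units).
profile = [0, 1, 2, 9, 4, 5, 6]
best_time, parts = min_time_partition(6, 2, profile)
print(best_time, parts)
```

With this invented profile, the balanced split (3, 3) takes 9 time units, whereas the uneven split (2, 4) takes only 4, so the optimum found by the sketch is not load-balanced.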

Journal: IEEE Transactions on Parallel and Distributed Systems
Publication Date: Apr 1, 2017
Citations: 53

Similar Papers

Bi-Objective Optimization of Data-Parallel Applications on Homogeneous Multicore Clusters for Performance and Energy
Ravindranath Reddy Manumachu ... Alexey Lastovetsky
IEEE Transactions on Computers | VOL. 67
01 Feb 2018

A Novel Data-Partitioning Algorithm for Performance Optimization of Data-Parallel Applications on Heterogeneous HPC Platforms
Hamidreza Khaleghzadeh ... Alexey Lastovetsky
IEEE Transactions on Parallel and Distributed Systems | VOL. 29
01 Oct 2018

A Hierarchical Data-Partitioning Algorithm for Performance Optimization of Data-Parallel Applications on Heterogeneous Multi-Accelerator NUMA Nodes
Hamidreza Khaleghzadeh ... Ravi Reddy Manumachu
IEEE Access | VOL. 8
26 Dec 2019

Design of self‐adaptable data parallel applications on multicore clusters automatically optimized for performance and energy through load distribution
Ravi Reddy Manumachu ... Alexey L Lastovetsky
Concurrency and Computation: Practice and Experience | VOL. 31
30 Aug 2018
