Parallel Data Partitioning Algorithms for Optimization of Data-Parallel Applications on Modern Extreme-Scale Multicore Platforms for Performance and Energy

Ravi Reddy Manumachu, Alexey Lastovetsky

doi:10.1109/access.2018.2879228

Open Access

https://doi.org/10.1109/access.2018.2879228

Journal: IEEE Access
Publication Date: Jan 1, 2018
Citations: 43
License type: cc-by-nc-nd

Affiliation: University College Dublin

Abstract

Data partitioning algorithms that aim to minimize the execution time and energy of computations in self-adaptable data-parallel applications on modern extreme-scale multicore platforms must address two critical challenges. First, they must account for the new complexities inherent in these platforms, such as severe resource contention and non-uniform memory access. Second, they must have low practical runtime and memory costs. The sequential data partitioning algorithms that address the first challenge have a theoretical time complexity of $O(m^2 \times p^2)$, where $m$ is the number of points in the discrete speed/energy function and $p$ is the number of available processors. However, they exhibit high practical runtime cost and an excessive memory footprint, which renders them impracticable for self-adaptable applications executing on extreme-scale multicore platforms. In this paper, we present parallel data partitioning algorithms that address both challenges. They take as input the functional models of performance and energy consumption against problem size, and they output workload distributions that are globally optimal solutions. They have a low time complexity of $O(m^2 \times p)$, thereby providing a linear speedup of $O(p)$, and a low memory complexity of $O(n)$, where $n$ is the workload size expressed as a multiple of granularity. They employ a dynamic programming approach, which also facilitates easier integration of performance and energy models of communications. We experimentally study the practical cost of applying our algorithms in two data-parallel applications, matrix multiplication and fast Fourier transform, on a cluster in the Grid’5000 platform. We demonstrate that their practical runtime and memory costs are low, making them ideal for employment in self-adaptable applications. We also show that the parallel algorithms exhibit tremendous speedups over the sequential algorithms. Finally, using theoretical analysis for a forecast exascale platform, we demonstrate that the parallel algorithms have negligible execution times compared to the matrix multiplication application executing on the platform.
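
To make the dynamic programming idea concrete, the following is a minimal sequential sketch, not the parallel algorithms presented in the paper. It assumes each processor's energy model is given as a discrete table indexed by the number of granules assigned, and it finds a distribution of `n` granules that minimizes total energy. The function name `partition_min_energy` and the table layout are hypothetical, chosen for illustration only. This textbook formulation runs in O(p * n^2) time and keeps an O(p * n) table for backtracking, so it does not reproduce the complexities stated in the abstract.

```python
from typing import List, Sequence, Tuple


def partition_min_energy(
    energy: Sequence[Sequence[float]], n: int
) -> Tuple[float, List[int]]:
    """Distribute n granules over len(energy) processors with minimal total energy.

    energy[i][x] is the (assumed, illustrative) energy that processor i
    consumes for x granules, for x = 0..n. The tables may be non-convex.
    """
    p = len(energy)
    inf = float("inf")
    # best[w]: minimal energy to place w granules on the processors seen so far.
    best = [0.0] + [inf] * n
    # choice[i][w]: granules given to processor i in the optimum for w granules.
    choice: List[List[int]] = []
    for i in range(p):
        new_best = [inf] * (n + 1)
        pick = [0] * (n + 1)
        for w in range(n + 1):
            for x in range(w + 1):
                cand = best[w - x] + energy[i][x]
                if cand < new_best[w]:
                    new_best[w] = cand
                    pick[w] = x
        best = new_best
        choice.append(pick)
    # Walk the choice table backwards to recover the distribution.
    dist = [0] * p
    w = n
    for i in range(p - 1, -1, -1):
        dist[i] = choice[i][w]
        w -= dist[i]
    return best[n], dist


if __name__ == "__main__":
    # Two processors with made-up, non-convex energy tables for 0..4 granules.
    tables = [[0, 3, 4, 9, 11], [0, 2, 6, 5, 12]]
    print(partition_min_energy(tables, 4))  # (8.0, [1, 3])
```

Replacing the sum `best[w - x] + energy[i][x]` with `max(best[w - x], time[i][x])` over hypothetical time tables would give the analogous sketch for minimizing parallel execution time instead of energy.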
