Coordinated Energy Management in Heterogeneous Processors

Indrani Paul, Manish Arora, Vignesh Ravi, Sudhakar Yalamanchili, Srilatha Manne

doi:10.1155/2014/210762

Abstract

This paper examines energy management in a heterogeneous processor consisting of an integrated CPU–GPU for high-performance computing (HPC) applications. Energy management for HPC applications is challenged by their uncompromising performance requirements and complicated by the need for coordinating energy management across distinct core types – a new and less understood problem. We examine the intra-node CPU–GPU frequency sensitivity of HPC applications on tightly coupled CPU–GPU architectures as the first step in understanding power and performance optimization for a heterogeneous multi-node HPC system. The insights from this analysis form the basis of a coordinated energy management scheme, called DynaCo, for integrated CPU–GPU architectures. We implement DynaCo on a modern heterogeneous processor and compare its performance to a state-of-the-art power- and performance-management algorithm. DynaCo improves measured average energy-delay squared (ED²) product by up to 30% with less than 2% average performance loss across several exascale and other HPC workloads.
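As a rough illustration of the metric quoted above, the energy-delay-squared product weights run time quadratically, so a scheme improves it only by saving energy without slowing the application much. The sketch below uses the standard definition (energy times delay squared). The function names and the sample measurements are hypothetical and are not taken from the paper.

```python
def energy_delay_squared(energy_joules: float, delay_seconds: float) -> float:
    """Standard ED^2 metric: energy multiplied by the square of execution time."""
    return energy_joules * delay_seconds ** 2


def relative_improvement(baseline: float, candidate: float) -> float:
    """Fractional reduction of a lower-is-better metric relative to a baseline."""
    return 1.0 - candidate / baseline


# Hypothetical measurements, for illustration only (not results from the paper).
baseline_ed2 = energy_delay_squared(energy_joules=100.0, delay_seconds=10.0)
managed_ed2 = energy_delay_squared(energy_joules=70.0, delay_seconds=10.2)

improvement = relative_improvement(baseline_ed2, managed_ed2)
print(f"ED^2 improvement: {improvement:.1%}")
```

With these made-up inputs, a 30% energy saving paired with a 2% slowdown gives an improvement of roughly 27%, which shows how even a small performance loss erodes the gain.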

Highlights

Efficient energy management is central to the effective operation of modern processors in platforms from mobile to data centers and high-performance computing (HPC) machines.
We evaluated a subset of benchmarks (S3D, Sort, Stencil2D, and Breadth-First Search (BFS)) from the Scalable Heterogeneous Computing (SHOC) benchmark suite [13] that represents a large portion of scientific code found in HPC applications.
This paper proposed and implemented a set of techniques to improve the energy efficiency of integrated CPU–graphics processing unit (GPU) processors.

Summary

Introduction

Efficient energy management is central to the effective operation of modern processors in platforms from mobile to data centers and high-performance computing (HPC) machines. Driven in part by demand for energy efficiency, we have seen the emergence of such processors with attached graphics processing units (GPUs) acting as accelerators.

The AMD A-Series processor contains two out-of-order dual-core CPU compute units (CUs, referred to as Piledriver modules) and a GPU. The GPU consists of 384 AMD Radeon™ cores, each capable of one single-precision fused multiply-add computation (FMAC) operation per cycle (the methodology and techniques in this paper are applicable to processors that support double precision). More details on the AMD A-Series processor can be found in [32].
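The core count and per-cycle capability above imply a peak single-precision throughput once a clock frequency is fixed. A minimal sketch of that arithmetic follows. It assumes the common convention of counting one fused multiply-add as two floating-point operations, and the clock frequency in the example is a placeholder rather than a figure from the paper.

```python
def peak_sp_gflops(num_cores: int, clock_ghz: float, flops_per_fma: int = 2) -> float:
    """Peak single-precision GFLOP/s when every core retires one FMA per cycle."""
    return num_cores * flops_per_fma * clock_ghz


GPU_CORES = 384          # core count stated in the text
EXAMPLE_CLOCK_GHZ = 0.8  # hypothetical clock, chosen only for illustration

peak = peak_sp_gflops(GPU_CORES, EXAMPLE_CLOCK_GHZ)
print(f"Peak throughput at {EXAMPLE_CLOCK_GHZ} GHz: {peak:.1f} GFLOP/s")
```

Because the result scales linearly with clock frequency, lowering the GPU frequency reduces peak throughput in direct proportion, which is one reason frequency sensitivity differs between compute-bound and memory-bound kernels.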

Journal: Scientific Programming	Publication Date: Jan 1, 2014
Citations: 5	License type: CC BY 3.0
