Power profiling of Cholesky and QR factorizations on distributed memory systems

George Bosilca, Hatem Ltaief, Jack Dongarra

doi:10.1007/s00450-012-0224-2

Abstract

This paper presents the power profiles of two high-performance dense linear algebra libraries for distributed memory systems, ScaLAPACK and DPLASMA. Algorithmically, the two libraries take opposite approaches. ScaLAPACK is based on block algorithms and relies on multithreaded BLAS and a two-dimensional block-cyclic data distribution to achieve high parallel performance. DPLASMA is based on tile algorithms running on top of a tile data layout and uses fine-grained task parallelism combined with a dynamic distributed scheduler (DAGuE) to exploit distributed memory systems. We present performance results (Gflop/s) as well as the power profile (Watts) of two common dense factorizations needed to solve linear systems of equations, namely Cholesky and QR. The reported numbers show that DPLASMA surpasses ScaLAPACK not only in performance (up to 2x speedup) but also in energy efficiency (up to 62%).
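The abstract relates two measured quantities, performance in Gflop/s and power in Watts, to energy efficiency. As a minimal sketch of how those quantities connect, the Python below integrates a sampled power trace into energy and divides a flop count by it. The function names and the sampling format are assumptions made for illustration and are not taken from the paper; the flop counts are the standard leading-order estimates for an n x n matrix (n^3/3 for Cholesky, 4n^3/3 for Householder QR).

```python
def energy_joules(times_s, power_w):
    """Integrate a sampled power trace with the trapezoidal rule.

    times_s: sample timestamps in seconds (increasing).
    power_w: power readings in Watts, one per timestamp.
    Returns energy in Joules (Watt-seconds).
    """
    if len(times_s) != len(power_w) or len(times_s) < 2:
        raise ValueError("need at least two matching (time, power) samples")
    total = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def cholesky_flops(n):
    """Leading-order flop count for Cholesky of an n x n matrix."""
    return n ** 3 / 3.0


def qr_flops(n):
    """Leading-order flop count for Householder QR of an n x n matrix."""
    return 4.0 * n ** 3 / 3.0


def gflops_per_watt(flops, times_s, power_w):
    """Energy efficiency: flops per Joule equals (flop/s) per Watt."""
    return flops / energy_joules(times_s, power_w) / 1e9
```

Under these assumptions, a run that draws a constant 100 W for 2 s consumes 200 J, so completing 2e11 flops in that window corresponds to 1 Gflop/s per Watt. A faster run at similar power draw finishes in less time, consumes less energy, and therefore scores higher on this metric.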

Journal: Computer Science - Research and Development
Publication Date: Aug 30, 2012
Citations: 13

Similar Papers

Profiling high performance dense linear algebra algorithms on multicore architectures for power and energy efficiency
Hatem Ltaief ... Jack Dongarra
Computer Science - Research and Development | VOL. 27 | 31 Aug 2011

ExaGeoStat: A High Performance Unified Software for Geostatistics on Manycore Systems
Sameh Abdulah ... Ying Sun
IEEE Transactions on Parallel and Distributed Systems | VOL. 29 | 01 Dec 2018

New Generalized Data Structures for Matrices Lead to a Variety of High Performance Dense Linear Algebra Algorithms
Fred G Gustavson
01 Jan 2006

High performance dense linear algebra on a spatially distributed processor
Jeffrey R Diamond ... Robert Van De Geijn
20 Feb 2008
