LU Factorization with Partial Pivoting for a Multi-CPU, Multi-GPU Shared Memory System

Jakub Kurzak, Jack Dongarra, Piotr Luszczek, Mathieu Faverge

doi:10.2172/1173291

Open Access

https://doi.org/10.2172/1173291

Publication Date: Mar 1, 2012
Citations: 9
License type: other-oa

Affiliation: University of Tennessee at Knoxville, Oak Ridge National Laboratory, University of Manchester

Keywords: NVIDIA Fermi GPUs, LU Factorization

Abstract

LU factorization with partial pivoting is a canonical numerical procedure and the main component of the High Performance LINPACK benchmark. This article presents an implementation of the algorithm for a hybrid shared-memory system with standard CPU cores and GPU accelerators. Performance in excess of one TeraFLOPS is achieved using four AMD Magny Cours CPUs and four NVIDIA Fermi GPUs.

Full Text