Abstract

Half-precision computation refers to performing floating-point operations in a 16-bit format. While the adoption of half precision has been driven largely by machine learning applications, recent algorithmic advances in numerical linear algebra have identified beneficial use cases for half precision in accelerating the solution of linear systems of equations at higher precisions. In this paper, we present a high-performance, mixed-precision linear solver (Ax = b) for symmetric positive definite systems in double precision using graphics processing units (GPUs). The solver is based on a mixed-precision Cholesky factorization that utilizes the high-performance tensor core units of CUDA-enabled GPUs. Since the Cholesky factors are affected by the low precision, an iterative refinement (IR) solver is required to recover the solution to double-precision accuracy. Two different types of IR solvers are evaluated on a wide range of test matrices. A preprocessing step is also developed that scales and shifts the matrix, if necessary, in order to preserve its positive definiteness in lower precisions. Our experiments on the V100 GPU show speedups of up to 4.7× against a direct double-precision solver. However, matrix properties such as the condition number and the eigenvalue distribution can affect the convergence rate of the IR solver, and consequently the overall performance.
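To make the pipeline concrete, below is a minimal NumPy sketch of the scale-shift-factor-refine scheme the abstract outlines. It is an illustration under stated assumptions, not the paper's implementation: a pass through float16 with float32 compute stands in for FP16 tensor-core arithmetic, NumPy's general `solve` stands in for triangular solves, and the function names, the default shift, and the convergence test are all assumptions made here.

```python
import numpy as np

def scale_and_shift(A, shift=None):
    # Two-sided diagonal scaling D*A*D with D = diag(a_ii^(-1/2)), giving a
    # unit diagonal, plus a small shift intended to guard positive
    # definiteness after rounding to half precision. The shift default is
    # illustrative; the paper derives its own criterion for when to shift.
    d = 1.0 / np.sqrt(np.diag(A))
    As = d[:, None] * A * d[None, :]
    if shift is None:
        shift = 2 * np.finfo(np.float16).eps  # hypothetical default
    return As, As + shift * np.eye(A.shape[0]), d

def cholesky_ir(A, b, A_low=None, tol=1e-12, max_iter=50):
    # Classical iterative refinement around a low-precision Cholesky factor.
    # A_low is the (possibly shifted) matrix handed to the factorization;
    # residuals are computed against A itself, so IR also corrects the shift.
    if A_low is None:
        A_low = A
    # Simulate fp16 storage with fp32 compute, roughly what tensor cores do.
    A16 = A_low.astype(np.float16).astype(np.float32)
    L = np.linalg.cholesky(A16)            # the one O(n^3) step, low precision

    def correct(r):
        # Two solves with the low-precision factor; a real code would use
        # triangular solves (e.g., scipy.linalg.solve_triangular).
        y = np.linalg.solve(L, r.astype(np.float32))
        return np.linalg.solve(L.T, y).astype(np.float64)

    x = correct(b)                         # initial low-precision solve
    for _ in range(max_iter):
        r = b - A @ x                      # residual in double precision
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        x += correct(r)                    # low-precision correction step
    return x

# Demo on a synthetic SPD system.
rng = np.random.default_rng(0)
n = 500
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)                # reasonably well-conditioned SPD
b = rng.standard_normal(n)
As, As_shifted, d = scale_and_shift(A)
x = d * cholesky_ir(As, d * b, A_low=As_shifted)  # undo scaling: x = D*y
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```

Note the design choice in the sketch: the residual is computed against the original (unshifted) matrix, so the refinement loop corrects for the shift as well as for the low-precision factorization error; the shift only needs to be large enough for the half-precision factorization to succeed.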
