Squeezing a Matrix into Half Precision, with an Application to Solving Linear Systems

Nicholas J Higham, Mawussi Zounon, Srikara Pranesh

doi:10.1137/18m1229511

Abstract

Motivated by the demand in machine learning, modern computer hardware is increasingly supporting reduced precision floating-point arithmetic, which provides advantages in speed, energy, and memory usage over single and double precision. Given the availability of such hardware, mixed precision algorithms that work in single or double precision but carry out part of a computation in half precision are now of great interest for general scientific computing tasks. Because of the limited range of half precision arithmetic, in which positive numbers lie between 6 × 10⁻⁸ and 7 × 10⁴, a straightforward rounding of single or double precision data into half precision can lead to overflow, underflow, or subnormal numbers being generated, all of which are undesirable. We develop an algorithm for converting a matrix from single or double precision to half precision. It first applies two-sided diagonal scaling in order to equilibrate the matrix (that is, to ensure that every row and column has ∞-norm 1), then multiplies by a scalar to bring the largest element within a factor θ ≤ 1 of the overflow level, and finally rounds to half precision. The second step ensures that full use is made of the limited range of half precision arithmetic, and θ must be chosen to allow sufficient headroom for subsequent computations. We apply the new algorithm to GMRES-based iterative refinement (GMRES-IR), which solves a linear system Ax = b with single or double precision data by LU factorizing A in half precision and carrying out iterative refinement with the correction equations solved by GMRES preconditioned with the low precision LU factors. Previous implementations of this algorithm have used a crude conversion to half precision that our experiments show can cause slow convergence of GMRES-IR for badly scaled matrices or failure to converge at all. The new conversion algorithm computes ∞-norms of rows and columns of the matrix and its cost is negligible in the context of LU factorization. We show that it leads to faster convergence of GMRES-IR for badly scaled matrices and thereby allows a much wider class of problems to be solved.
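The three steps described in the abstract (equilibrate, scale toward the overflow level, round) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say which equilibration scheme is used, so the sketch substitutes a simple Ruiz-style iteration, and the function names (`to_fp16`, `scaled`, `equilibrate`, `squeeze_to_fp16`), the iteration count, and the default θ = 0.1 are all hypothetical choices made for the example. Rounding to half precision is simulated with the standard library's `struct` format `e`.

```python
import math
import struct

FP16_MAX = 65504.0  # largest finite IEEE fp16 value, (2 - 2**-10) * 2**15


def to_fp16(x):
    """Round a float to the nearest fp16 value (returned as a Python float).

    struct raises OverflowError when x is too large for fp16.
    """
    return struct.unpack("<e", struct.pack("<e", x))[0]


def scaled(A, r, s):
    """Return diag(r) * A * diag(s) as a list of rows."""
    return [[r[i] * a * s[j] for j, a in enumerate(row)]
            for i, row in enumerate(A)]


def equilibrate(A, iters=30):
    """Ruiz-style two-sided scaling (an assumed scheme, see lead-in).

    Each sweep divides every row and column scale by the square root of the
    current row or column infinity-norm. Assumes A has no zero row or column.
    """
    m, n = len(A), len(A[0])
    r, s = [1.0] * m, [1.0] * n
    for _ in range(iters):
        B = scaled(A, r, s)
        row = [max(abs(v) for v in B[i]) for i in range(m)]
        col = [max(abs(B[i][j]) for i in range(m)) for j in range(n)]
        r = [r[i] / math.sqrt(row[i]) for i in range(m)]
        s = [s[j] / math.sqrt(col[j]) for j in range(n)]
    return r, s


def squeeze_to_fp16(A, theta=0.1):
    """Equilibrate A, scale its largest entry to theta * FP16_MAX, round to fp16."""
    r, s = equilibrate(A)
    B = scaled(A, r, s)
    beta = max(abs(v) for row in B for v in row)  # close to 1 after equilibration
    mu = theta * FP16_MAX / beta                  # scalar multiplier of step two
    H = [[to_fp16(mu * v) for v in row] for row in B]
    return H, r, s, mu
```

For a badly scaled input such as `[[1e10, 2.0], [3.0, 1e-10]]`, calling `to_fp16` directly on the entry `1e10` overflows, whereas `squeeze_to_fp16` returns a matrix whose largest entry sits near θ times the overflow level. The returned `r`, `s`, and `mu` are kept because any computation done with the fp16 matrix has to undo the scaling afterwards.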

Highlights

The landscape of scientific computing is changing, because of the growing availability and usage of low precision floating-point arithmetic
We look at the percentage of nonzero elements of a matrix that underflow after scaling and rounding to fp16 as well as the performance of GMRES-based iterative refinement
Converting a floating-point matrix to lower precision is not a trivial task when the lower precision format has a much narrower range than the original one, especially when the target is the fp16 arithmetic that is increasingly available in hardware

Summary

Introduction

The landscape of scientific computing is changing, because of the growing availability and usage of low precision floating-point arithmetic. The 2008 revision of IEEE standard 754 introduced a 16-bit floating point format, known as half precision (fp16) [19]. Defined only as a storage format, it has been widely adopted for computing, and is supported by the NVIDIA P100 and V100 GPUs and the AMD Radeon Instinct MI25 GPU. On such hardware, half precision operations run at least twice as fast as single precision ones, and up to 8 times faster on the NVIDIA V100 because of its tensor cores.

Journal: SIAM Journal on Scientific Computing
Publication Date: Jan 1, 2019
Citations: 49
License type: cc-by

Similar Papers

Performance impact of precision reduction in sparse linear systems solvers.
Mawussi Zounon ... Nicholas J Higham
PeerJ Computer Science | VOL. 8
17 Jan 2022

Solving Systems of Linear Equations on the CELL Processor Using Cholesky Factorization
J Kurzak ... A Buttari
IEEE Transactions on Parallel and Distributed Systems | VOL. 19
01 Sep 2008

Speeding up the GENGA N-body integrator on consumer-grade graphics cards
R Brasser ... J G Stadel
Astronomy & Astrophysics | VOL. 678
01 Oct 2023

Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions
Erin Carson ... Nicholas J Higham
SIAM Journal on Scientific Computing | VOL. 40
01 Jan 2018
