Efficient Mixed-Precision Matrix Factorization of the Inverse Overlap Matrix in Electronic Structure Calculations with AI-Hardware and GPUs.

Adela Habib, Joshua Finkelstein, Anders M. N. Niklasson

doi:10.1021/acs.jctc.4c00584

Abstract

In recent years, a new kind of accelerated hardware has gained popularity in the artificial intelligence (AI) community. It enables extremely high-performance tensor contractions in reduced precision for deep neural network calculations. In this article, we exploit Nvidia Tensor cores, a prototypical example of such AI hardware, to develop a mixed-precision approach for computing a dense matrix factorization of the inverse overlap matrix in electronic structure theory, $S^{-1}$. This factorization, written as $ZZ^T = S^{-1}$, is used to transform the generalized matrix eigenvalue problem into a standard matrix eigenvalue problem. Here we present a mixed-precision iterative refinement algorithm in which $Z$ is given recursively using matrix-matrix multiplications and can be computed with high performance on Tensor cores. To understand the performance and accuracy of Tensor cores, comparisons are made to GPU-only implementations in single and double precision. Additionally, we propose a nonparametric stopping criterion that is robust in the face of lower-precision floating-point operations. The algorithm is particularly useful when a good initial guess for $Z$ is available, for example, from previous time steps in quantum-mechanical molecular dynamics simulations or from a previous iteration in a geometry optimization.
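The refinement idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes a standard truncated-series update, in which the residual delta = I - Z^T S Z is used to correct Z through matrix products only. float32 stands in for the reduced precision of Tensor cores, and the "stop when the residual no longer decreases" rule is one simple tolerance-free criterion chosen here for illustration. The names refine_inverse_factor, to_standard_form, and all test matrices are hypothetical.

```python
import numpy as np


def refine_inverse_factor(S, Z0, work_dtype=np.float32, max_iter=50):
    """Refine Z0 toward a Z with Z @ Z.T ~= inv(S) using only matrix products.

    work_dtype stands in for the reduced precision of the accelerator.
    """
    S_w = S.astype(work_dtype)
    Z = Z0.astype(work_dtype)
    eye = np.eye(S.shape[0], dtype=work_dtype)
    best_Z, best_err = Z, np.inf
    for _ in range(max_iter):
        # Residual of the congruence condition Z^T S Z = I.
        delta = eye - Z.T @ S_w @ Z
        err = float(np.linalg.norm(delta))  # Frobenius norm
        # Tolerance-free stop (an assumption of this sketch): quit once the
        # residual no longer decreases, i.e. the precision floor is reached.
        if err >= best_err:
            break
        best_Z, best_err = Z, err
        # Truncated series for (I - delta)^(-1/2): I + delta/2 + 3/8 delta^2.
        Z = Z @ (eye + 0.5 * delta + 0.375 * (delta @ delta))
    return best_Z.astype(np.float64), best_err


def to_standard_form(H, Z):
    """Congruence transform: H x = e S x becomes (Z^T H Z) y = e y, x = Z y."""
    return Z.T @ H @ Z


# Hypothetical data: a well-conditioned "overlap" matrix, an exact factor from
# a "previous step", and a slightly perturbed matrix for the current step.
rng = np.random.default_rng(0)
n = 64
A = rng.standard_normal((n, n))
S_old = np.eye(n) + 0.02 * (A + A.T) / 2
w, V = np.linalg.eigh(S_old)
Z_old = (V / np.sqrt(w)) @ V.T  # S_old^(-1/2), used as the initial guess
B = rng.standard_normal((n, n))
S_new = S_old + 1e-3 * (B + B.T) / 2

Z, residual = refine_inverse_factor(S_new, Z_old)
factor_error = np.linalg.norm(Z @ Z.T - np.linalg.inv(S_new))

# Use Z to solve a generalized eigenproblem with a hypothetical symmetric H.
H = (B + B.T) / 2
energies, Y = np.linalg.eigh(to_standard_form(H, Z))
X = Z @ Y  # back-transform the eigenvectors
eig_residual = np.linalg.norm(H @ X - (S_new @ X) * energies)
```

Because the initial guess comes from a nearby matrix, the residual starts small and the update needs only a few matrix products before it stalls at the working-precision floor. This is the situation the abstract describes for successive molecular dynamics or geometry optimization steps.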

Journal: Journal of Chemical Theory and Computation

Similar Papers

NVIDIA Tensor Core Programmability, Performance & Precision
Stefano Markidis ... Ivy Bo Peng
01 May 2018

Reducing shared memory footprint to leverage high throughput on Tensor Cores and its flexible API extension library
Hiroyuki Ootomo ... Rio Yokota
27 Feb 2023

Mixed precision LU factorization on GPU tensor cores: reducing data movement and memory footprint
Florent Lopez ... Theo Mary
The International Journal of High Performance Computing Applications | Vol. 37
03 Jan 2023

Towards Half-Precision Computation for Complex Matrices: A Case Study for Mixed Precision Solvers on GPUs
Ahmad Abdelfattah ... Stanimire Tomov
01 Nov 2019
