Efficient Mixed-Precision Matrix Factorization of the Inverse Overlap Matrix in Electronic Structure Calculations with AI-Hardware and GPUs.
- Source :
- Journal of chemical theory and computation [J Chem Theory Comput] 2024 Aug 13. Date of Electronic Publication: 2024 Aug 13.
- Publication Year :
- 2024
- Publisher :
- Ahead of Print
Abstract
- In recent years, a new kind of accelerated hardware has gained popularity in the artificial intelligence (AI) community, enabling extremely high-performance tensor contractions in reduced precision for deep neural network calculations. In this article, we exploit Nvidia Tensor cores, a prototypical example of such AI-hardware, to develop a mixed-precision approach for computing a dense matrix factorization of the inverse overlap matrix in electronic structure theory, $S^{-1}$. This factorization of $S^{-1}$, written as $ZZ^{T} = S^{-1}$, is used to transform the generalized matrix eigenvalue problem into a standard matrix eigenvalue problem. Here we present a mixed-precision iterative refinement algorithm in which $Z$ is given recursively using matrix-matrix multiplications and can be computed with high performance on Tensor cores. To understand the performance and accuracy of Tensor cores, comparisons are made to GPU-only implementations in single and double precision. Additionally, we propose a nonparametric stopping criterion that is robust in the face of lower-precision floating-point operations. The algorithm is particularly useful when we have a good initial guess for $Z$, for example, from previous time steps in quantum-mechanical molecular dynamics simulations or from a previous iteration in a geometry optimization.
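
The idea of refining $Z$ using only matrix-matrix multiplications can be illustrated with a minimal NumPy sketch. This is not the article's implementation: the abstract does not give the recursion or the stopping rule, so the sketch assumes a first-order update $Z \leftarrow Z(I + \delta/2)$ with $\delta = I - Z^{T} S Z$, and a tolerance-free stop that ends the loop once the residual norm no longer decreases. The function name `refine_inverse_factor`, the scaled-identity starting guess, and the use of `float32` on the CPU as a stand-in for reduced-precision Tensor core arithmetic are all assumptions made for illustration.

```python
import numpy as np


def refine_inverse_factor(S, Z0, dtype=np.float32, max_iter=100):
    """Refine Z so that Z^T S Z ~ I (hence Z Z^T ~ S^-1).

    Hypothetical sketch: all matrix products are carried out in `dtype`.
    Assumes the starting residual satisfies ||I - Z0^T S Z0|| < 1.
    """
    S_lp = S.astype(dtype)
    Z = Z0.astype(dtype)
    identity = np.eye(S.shape[0], dtype=dtype)

    best_Z, best_err = Z, np.inf
    for _ in range(max_iter):
        # Residual of the congruence transform; zero when Z^T S Z = I.
        delta = identity - Z.T @ S_lp @ Z
        err = float(np.linalg.norm(delta))  # Frobenius norm

        # Tolerance-free stop (assumed rule): once the residual stops
        # decreasing, rounding error in `dtype` dominates.
        if err >= best_err:
            break
        best_Z, best_err = Z, err

        # First-order refinement step: Z <- Z (I + delta / 2).
        Z = Z + 0.5 * (Z @ delta)

    return best_Z.astype(np.float64), best_err


# Small symmetric positive definite test matrix standing in for an overlap matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 8))
S = np.eye(8) + 0.1 * (A @ A.T) / 8

# Scaled identity as a crude starting guess; a warm start from a previous
# step could be passed here instead.
Z0 = np.eye(8) / np.sqrt(np.linalg.norm(S, 2))

Z_single, err_single = refine_inverse_factor(S, Z0, dtype=np.float32)
Z_double, err_double = refine_inverse_factor(S, Z0, dtype=np.float64)
print("float32 residual norm:", err_single)
print("float64 residual norm:", err_double)
```

Because the stop depends only on whether the residual still decreases, the same loop runs unchanged in either precision and ends at whatever accuracy the chosen `dtype` supports. Passing a better starting guess as `Z0` reduces the number of multiplications needed.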
Details
- Language :
- English
- ISSN :
- 1549-9626
- Database :
- MEDLINE
- Journal :
- Journal of chemical theory and computation
- Publication Type :
- Academic Journal
- Accession number :
- 39136963
- Full Text :
- https://doi.org/10.1021/acs.jctc.4c00584