Abstract

This article describes a new high-performance implementation of the QR-based Dynamically Weighted Halley Singular Value Decomposition (QDWH-SVD) solver on multicore architectures enhanced with multiple GPUs. The standard QDWH-SVD algorithm was introduced by Nakatsukasa and Higham (SIAM SISC, 2013) and combines three successive computational stages: (1) the polar decomposition of the original matrix using the QDWH algorithm, (2) the symmetric eigendecomposition of the resulting polar factor to obtain the singular values and the right singular vectors, and (3) a matrix-matrix multiplication to obtain the associated left singular vectors. A comprehensive test suite highlights the numerical robustness of the QDWH-SVD solver. Although it performs up to two times more flops than the standard SVD algorithm when computing all singular vectors, our new high-performance implementation on a single GPU achieves up to a 4× improvement for asymptotic matrix sizes compared to the equivalent routines from existing state-of-the-art open-source and commercial libraries. When only singular values are needed, QDWH-SVD is penalized by performing an order of magnitude more flops. Even so, the singular-value-only implementation of QDWH-SVD on a single GPU can still run up to 18% faster than the best existing equivalent routines.

Highlights

  • Computing the singular value decomposition (SVD) is a critical operation for solving least squares problems, determining the pseudoinverse of a matrix, or calculating low-rank matrix approximations, with direct application to signal processing, pattern recognition and statistics [8, 9, 25].

  • First introduced by Nakatsukasa and Higham [24], QDWH-SVD is a spectral divide-and-conquer algorithm for the SVD composed of three successive computational stages: (1) computing the polar decomposition of the original matrix using the QR-based Dynamically Weighted Halley (QDWH) algorithm, (2) calculating the symmetric eigendecomposition of the resulting polar factor to obtain the eigenvalues and their associated eigenvectors, and (3) applying a matrix-matrix multiplication to get the remaining left singular vectors.

  • A mixed-precision technique is integrated into the original QDWH-SVD algorithm so that the first iterations are done in single precision (SP) and the subsequent ones in double precision (DP), in order to recover some of the lost digits.
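
The three stages named in the highlights above can be written compactly as follows. This is a short worked derivation for a real square matrix A, not an excerpt from the article; the symbols U_p, H, V, Σ, α and the weights a_k, b_k, c_k are notation assumed here, following the usual presentation of the QDWH iteration.

```latex
\begin{align*}
% Stage 1: polar decomposition, computed as the limit of the QDWH iteration
X_0 &= A/\alpha, \qquad \alpha \ge \|A\|_2, \\
X_{k+1} &= X_k \left(a_k I + b_k X_k^{\top} X_k\right)
               \left(I + c_k X_k^{\top} X_k\right)^{-1},
           \qquad X_k \to U_p, \\
A &= U_p H, \qquad U_p^{\top} U_p = I, \quad H = H^{\top} \succeq 0, \\
% Stage 2: symmetric eigendecomposition of the symmetric factor H
H &= V \Sigma V^{\top}, \qquad V^{\top} V = I, \quad \Sigma = \operatorname{diag}(\sigma_1,\dots,\sigma_n), \\
% Stage 3: one matrix-matrix multiplication yields the left singular vectors
U &= U_p V
  \quad\Longrightarrow\quad
  A = U_p H = (U_p V)\,\Sigma\,V^{\top} = U \Sigma V^{\top}.
\end{align*}
```

Because H is symmetric positive semidefinite, its eigenvalues are nonnegative and coincide with the singular values of A, which is why stage 2 delivers the singular values and right singular vectors directly.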


Summary

Introduction

Computing the singular value decomposition (SVD) is a critical operation for solving least squares problems, determining the pseudoinverse of a matrix, or calculating low-rank matrix approximations, with direct application to signal processing, pattern recognition and statistics [8, 9, 25]. A high-performance implementation of the overall QDWH-SVD solver framework is described, which integrates the three computational stages: polar decomposition, symmetric eigensolver and matrix-matrix multiplication.
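
To make the three-stage framework concrete, here is a minimal CPU-only sketch in Python with NumPy. It is an illustration under stated assumptions, not the article's multi-GPU implementation: the function names `qdwh_polar` and `qdwh_svd` are hypothetical, the input is assumed to be a real, square, nonsingular matrix, and the lower bound on the smallest singular value is obtained from an explicit inverse purely for brevity.

```python
import numpy as np


def qdwh_polar(a, tol=1e-10, max_iter=30):
    """Orthogonal polar factor of a real, square, nonsingular matrix (QR-based QDWH)."""
    n = a.shape[0]
    # Scale so that all singular values of x lie in (0, 1]; the Frobenius norm
    # is a cheap upper bound on the largest singular value.
    x = a / np.linalg.norm(a, "fro")
    # Lower bound on the smallest singular value of x. An explicit inverse is an
    # assumption made to keep the sketch short; real solvers use an estimator.
    ell = 1.0 / (np.sqrt(n) * np.linalg.norm(np.linalg.inv(x), 1))
    eye = np.eye(n)
    for _ in range(max_iter):
        l2 = ell * ell
        # Dynamically chosen weights (a_k, b_k, c_k) from the current bound ell.
        d = (4.0 * (1.0 - l2) / (l2 * l2)) ** (1.0 / 3.0)
        a_k = np.sqrt(1.0 + d) + 0.5 * np.sqrt(
            8.0 - 4.0 * d + 8.0 * (2.0 - l2) / (l2 * np.sqrt(1.0 + d))
        )
        b_k = (a_k - 1.0) ** 2 / 4.0
        c_k = a_k + b_k - 1.0
        # Inverse-free update: QR factorization of the stacked matrix [sqrt(c_k) x; I].
        q, _ = np.linalg.qr(np.vstack([np.sqrt(c_k) * x, eye]))
        q1, q2 = q[:n], q[n:]
        x_new = (b_k / c_k) * x + (a_k - b_k / c_k) / np.sqrt(c_k) * (q1 @ q2.T)
        # Update the lower bound, clamped at 1 to guard against rounding.
        ell = min(1.0, ell * (a_k + b_k * l2) / (1.0 + c_k * l2))
        converged = np.linalg.norm(x_new - x, "fro") <= tol
        x = x_new
        if converged:
            break
    return x


def qdwh_svd(a):
    """Three-stage SVD: polar decomposition, symmetric eigensolver, matrix multiply."""
    u_p = qdwh_polar(a)              # stage 1: a = u_p @ h
    h = u_p.T @ a
    h = 0.5 * (h + h.T)              # enforce symmetry against rounding errors
    w, v = np.linalg.eigh(h)         # stage 2: h = v @ diag(w) @ v.T
    order = np.argsort(w)[::-1]      # report singular values in descending order
    s, v = w[order], v[:, order]
    u = u_p @ v                      # stage 3: left singular vectors
    return u, s, v


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    a = rng.standard_normal((6, 6))
    u, s, v = qdwh_svd(a)
    print("reconstruction error:", np.linalg.norm(u @ np.diag(s) @ v.T - a))
```

The sketch returns `u`, `s`, `v` such that `a` is approximately `u @ np.diag(s) @ v.T`. Each stage is built from QR factorizations, a symmetric eigensolver and matrix-matrix multiplications, which is the structure the article maps onto GPUs.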

