Newly Released Capabilities in the Distributed-Memory SuperLU Sparse Direct Solver

Xiaoye S. Li, Paul Lin, Yang Liu, Piyush Sao

doi:10.1145/3577197

Abstract

We present the new features available in the recent release of SuperLU_DIST, Version 8.1.1. SuperLU_DIST is a distributed-memory parallel sparse direct solver. The new features include (1) a 3D communication-avoiding algorithm framework that reduces inter-process communication at the cost of selective memory duplication, (2) multi-GPU support for both NVIDIA and AMD GPUs, and (3) mixed-precision routines that perform single-precision LU factorization and double-precision iterative refinement. Apart from the algorithmic improvements, we also modernized the software build system to use CMake and the Spack package installation tool to simplify the installation procedure. Throughout the article, we describe in detail the pertinent performance-sensitive parameters associated with each new algorithmic feature, show how they are exposed to users, and give general guidance on how to set them. We illustrate that the solver's performance in both time and memory can be greatly improved by systematic tuning of these parameters, depending on the input sparse matrix and the underlying hardware.
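The mixed-precision idea in item (3) can be illustrated with a small dense example. The sketch below is not SuperLU_DIST code and does not use its API: it is a NumPy illustration, assuming a dense, well-conditioned matrix, and the function names (`lu_factor_f32`, `lu_solve_f32`, `solve_mixed_precision`) and the default tolerance are hypothetical. The factorization is computed once in single precision, and the residual and solution update are kept in double precision.

```python
import numpy as np


def lu_factor_f32(a):
    """Dense LU with partial pivoting, carried out entirely in float32."""
    lu = np.array(a, dtype=np.float32)  # private copy, demoted to single precision
    n = lu.shape[0]
    perm = np.arange(n)                 # row permutation chosen by pivoting
    for k in range(n - 1):
        p = k + int(np.argmax(np.abs(lu[k:, k])))  # row of the largest pivot
        if p != k:
            lu[[k, p]] = lu[[p, k]]
            perm[[k, p]] = perm[[p, k]]
        lu[k + 1:, k] /= lu[k, k]                   # multipliers (column of L)
        lu[k + 1:, k + 1:] -= np.outer(lu[k + 1:, k], lu[k, k + 1:])
    return lu, perm


def lu_solve_f32(lu, perm, b):
    """Forward and backward substitution with the float32 factors."""
    n = lu.shape[0]
    y = np.asarray(b, dtype=np.float32)[perm]       # permuted copy of b
    for i in range(1, n):                           # solve L y = P b (unit diagonal)
        y[i] -= lu[i, :i] @ y[:i]
    for i in range(n - 1, -1, -1):                  # solve U x = y
        y[i] = (y[i] - lu[i, i + 1:] @ y[i + 1:]) / lu[i, i]
    return y


def solve_mixed_precision(a, b, tol=1e-12, max_iter=20):
    """Single-precision LU followed by double-precision iterative refinement."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    lu, perm = lu_factor_f32(a)                     # costly step, done once in float32
    x = lu_solve_f32(lu, perm, b).astype(np.float64)
    for _ in range(max_iter):
        r = b - a @ x                               # residual in float64
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        # Correction from a cheap float32 solve, accumulated in float64.
        x += lu_solve_f32(lu, perm, r).astype(np.float64)
    return x
```

Each refinement step reuses the single-precision factors, so its cost is only a pair of triangular solves and one matrix-vector product. Whether the iteration converges depends on the conditioning of the matrix; for ill-conditioned systems the single-precision factors may be too inaccurate for refinement to help.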


Journal: ACM Transactions on Mathematical Software
Publication Date: Mar 21, 2023
Citations: 5
License type: other-oa

Similar Papers

Addressing Irregular Patterns of Matrix Computations on GPUs and Their Impact on Applications Powered by Sparse Direct Solvers
Ahmad Abdelfattah ... Wajih Boukaram
01 Nov 2022

Direct solution of larger coupled sparse/dense linear systems using low-rank compression on single-node multi-core machines in an industrial context
Emmanuel Agullo ... Guillaume Sylvand
01 May 2022

Viability Study of SYCL as a Unified Programming Model for Heterogeneous Systems Based on GPUs in Bioinformatics
Manuel Costanzo
Journal of Computer Science and Technology | VOL. 24
18 Oct 2024

Exploring the possibility of a hipSYCL-based implementation of oneAPI
Aksel Alpay ... Holger Wünsche
10 May 2022
