Parallel Linear Algebra Library Research Articles

HDSS (Huge Dense Linear System Solver) is a Fortran Application Programming Interface (API) that helps scientists and engineers solve very large dense linear systems in parallel. The API uses parallelism to solve these systems efficiently on a wide range of parallel platforms, from clusters of processors to massively parallel multiprocessors. It exploits out-of-core strategies, which use secondary memory, to solve huge linear systems with O(100 000) equations. The API is based on the parallel linear algebra library PLAPACK and on its Out-Of-Core (OOC) extension POOCLAPACK. Both PLAPACK and POOCLAPACK use the Message Passing Interface (MPI) as the communication layer and BLAS to perform the local matrix operations. The API gives users a friendly interface that hides almost all the technical aspects of the parallel execution of the code and of the use of secondary memory. In particular, the API can automatically select the best way to store and solve the systems, depending on the dimension of the system, the number of processes and the main memory of the platform. Experimental results on several parallel platforms report high performance, reaching more than 1 TFLOPS with 64 cores when solving a system with more than 200 000 equations and more than 10 000 right-hand side vectors.

New version program summary
Program title: Huge Dense System Solver (HDSS)
Catalogue identifier: AEHU_v1_1
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_1.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 87 062
No. of bytes in distributed program, including test data, etc.: 1 069 110
Distribution format: tar.gz
Programming language: Fortran 90, C
Computer: Parallel architectures: multiprocessors, computer clusters
Operating system: Linux/Unix
Has the code been vectorized or parallelized?: Yes, includes MPI primitives.
RAM: Tested for up to 190 GB
Classification: 6.5
External routines: MPI (http://www.mpi-forum.org/), BLAS (http://www.netlib.org/blas/), PLAPACK (http://www.cs.utexas.edu/~plapack/), POOCLAPACK (ftp://ftp.cs.utexas.edu/pub/rvdg/PLAPACK/pooclapack.ps). The code for PLAPACK and POOCLAPACK is included in the distribution.
Catalogue identifier of previous version: AEHU_v1_0
Journal reference of previous version: Comput. Phys. Comm. 182 (2011) 533
Does the new version supersede the previous version?: Yes
Nature of problem: Huge scale dense systems of linear equations, Ax = B, beyond standard LAPACK capabilities.
Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient.
Reasons for new version: Many applications need to guarantee high accuracy in the solution of very large linear systems, which can be achieved by using double-precision arithmetic.
Summary of revisions: Version 1.1
• Can be used to solve linear systems using double-precision arithmetic.
• New version of the initialization routine. The user can choose the kind of arithmetic and the values of several parameters of the environment.
Running time: About 5 hours to solve a system with more than 200 000 equations and more than 10 000 right-hand side vectors using double-precision arithmetic on an eight-node commodity cluster with a total of 64 Intel cores.
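The abstract says the API chooses how to store and solve a system from its dimension, the number of processes and the available main memory. The sketch below illustrates only the memory arithmetic behind such a choice. It is not HDSS's actual selection logic: the function names `dense_system_bytes` and `choose_storage` are hypothetical, and treating the "tested for up to 190 GB" figure as the aggregate RAM of the platform is an assumption made for the example. Workspace and communication buffers are ignored.

```python
def dense_system_bytes(n, nrhs, bytes_per_element=8):
    """Bytes needed to hold an n x n matrix A and an n x nrhs block B.

    bytes_per_element is 8 for double precision and 4 for single precision.
    """
    return (n * n + n * nrhs) * bytes_per_element


def choose_storage(n, nrhs, total_ram_bytes, bytes_per_element=8):
    """Hypothetical rule: stay in core only if A and B fit in aggregate RAM."""
    needed = dense_system_bytes(n, nrhs, bytes_per_element)
    return "in-core" if needed <= total_ram_bytes else "out-of-core"


GB = 10**9  # decimal gigabytes, for readability

# System size quoted in the abstract: 200 000 equations, 10 000 right-hand sides.
print(dense_system_bytes(200_000, 10_000) / GB)    # 336.0 (double precision)
print(choose_storage(200_000, 10_000, 190 * GB))   # out-of-core
```

Under these assumptions the double-precision problem needs about 336 GB for A and B alone, which exceeds 190 GB and so calls for secondary storage, while the same system in single precision (168 GB) would fit.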

We present a Fortran library which can be used to solve large-scale dense linear systems, Ax = b. The library is based on the LU decomposition included in the parallel linear algebra library PLAPACK and on its out-of-core extension POOCLAPACK. The library is complemented with a code which calculates the self-polarization charges and self-energy potential of axially symmetric nanostructures, following an induced charge computation method. Illustrative calculations are provided for hybrid semiconductor–quasi-metal zero-dimensional nanostructures. In these systems, the numerical integration of the self-polarization equations requires a very fine mesh. This translates into very large and dense linear systems, which we solve for ranks up to 3 × 10^5. It is shown that the self-energy potential on the semiconductor–metal interface has important effects on the electronic wavefunction.

Program summary
Program title: HDSS (Huge Dense System Solver)
Catalogue identifier: AEHU_v1_0
Program summary URL: http://cpc.cs.qub.ac.uk/summaries/AEHU_v1_0.html
Program obtainable from: CPC Program Library, Queen's University, Belfast, N. Ireland
Licensing provisions: Standard CPC licence, http://cpc.cs.qub.ac.uk/licence/licence.html
No. of lines in distributed program, including test data, etc.: 98 889
No. of bytes in distributed program, including test data, etc.: 1 009 622
Distribution format: tar.gz
Programming language: Fortran 90, C
Computer: Parallel architectures: multiprocessors, computer clusters
Operating system: Linux/Unix
Has the code been vectorized or parallelized?: Yes. 4 processors used in the sample tests; tested from 1 to 288 processors
RAM: 2 GB for the sample tests; tested for up to 80 GB
Classification: 7.3
External routines: MPI, BLAS, PLAPACK, POOCLAPACK. PLAPACK and POOCLAPACK are included in the distribution file.
Nature of problem: Huge scale dense systems of linear equations, Ax = B, beyond standard LAPACK capabilities. Application to calculations of self-energy potential in dielectrically mismatched semiconductor quantum dots.
Solution method: The linear systems are solved by means of parallelized routines based on the LU factorization, using efficient secondary storage algorithms when the available main memory is insufficient. The self-energy solver relies on an induced charge computation method. The differential equation is discretized to yield linear systems of equations, which we then solve by calling the HDSS library.
Restrictions: Single precision. For the self-energy solver, axially symmetric systems must be considered.
Running time: About 32 minutes to solve a system with approximately 100 000 equations and more than 6000 right-hand side vectors using a four-node commodity cluster with a total of 32 Intel cores.
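The solution method above rests on the LU factorization and is applied to systems with thousands of right-hand side vectors. The point of factoring first is that the expensive step is done once and each additional right-hand side costs only two triangular solves. The sketch below shows this on a small in-memory matrix in plain Python. It is a serial textbook version with partial pivoting, not the parallel or out-of-core algorithm of PLAPACK and POOCLAPACK, and the names `lu_factor` and `lu_solve` are hypothetical helpers, not HDSS routines.

```python
def lu_factor(a):
    """LU factorization with partial pivoting of a square matrix (list of rows).

    Returns (lu, perm): lu stores the multipliers of L below the diagonal and
    U on and above it; perm[i] is the original index of the row now at i.
    """
    n = len(a)
    lu = [row[:] for row in a]  # work on a copy, leave the input untouched
    perm = list(range(n))
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        if p != k:
            lu[k], lu[p] = lu[p], lu[k]
            perm[k], perm[p] = perm[p], perm[k]
        # Eliminate column k below the diagonal, storing the multipliers.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            m = lu[i][k]
            for j in range(k + 1, n):
                lu[i][j] -= m * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Solve A x = b for one right-hand side using the stored factors."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):                  # forward substitution, unit-diagonal L
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):        # back substitution with U
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Factor once, then reuse the factors for every right-hand side.
A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
lu, perm = lu_factor(A)
solutions = [lu_solve(lu, perm, b) for b in ([7.0, 19.0, 49.0], [1.0, 1.0, -1.0])]
```

For the two right-hand sides used here the exact solutions are (1, 2, 3) and (1, 0, -1); the computed values agree up to floating-point rounding.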

Articles published on Parallel Linear Algebra Library

Bethe–Salpeter equation for absorption and scattering spectroscopy: implementation in the exciting code

On Improving Computational Efficiency of Simplified Fluid Flow Models

A parallel solver for huge dense linear systems

Large-scale linear system solver using secondary storage: Self-energy in hybrid nanostructures

Interfaces for parallel numerical linear algebra libraries in high level languages

A parallel Broyden approach to the Toeplitz inverse eigenproblem

PlapackJava: Towards an efficient Java interface for high performance parallel linear algebra

An annotation language for optimizing software libraries

Design of a parallel linear algebra library for verified computation

A reliable linear algebra library for transputer networks

Solving the least squares problem using a parallel linear algebra library
