Abstract

The streamed storage format for sparse matrices showed good performance improvement for sparse matrix and vector multiply (SpMV) compared with compressed sparse row (CSR) and block CSR (BCSR) formats, particularly on IBM Power processors. We extend the format to exploit single instruction multiple data (SIMD) instructions in order to utilize the vector unit, and discuss how the streamed formats perform on the Power7 processor, which is the first eight-core chip from IBM. The streamed format is then applied to two more operations of sparse matrices, successive over-relaxation (SOR) iteration sweeps and incomplete lower and upper (ILU) triangular solvers. Basic solvers are developed for them in the high-performance computing (HPC) package PETSc. Test results on the IBM Power7 processor show that the SIMD instructions improve the performance of the streamed storage format on SpMV. The format also accelerates SOR iteration sweeps and ILU matrix solvers, compared with the traditional BCSR format used in PETSc.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call