Abstract
SpMV (sparse matrix-vector multiplication) is an important computational kernel in traditional high-performance computing and in emerging data-intensive applications. For diagonal sparse matrices, the DIA (Diagonal) storage format frequently requires filling in a large number of zeros to preserve the diagonal structure. This zero filling consumes additional computing and memory resources, degrades the parallel performance of SpMV, and introduces computational and storage redundancy. To overcome these deficiencies of the DIA format, this paper presents a two-stage parallel SpMV method that distributes the data of the diagonal part and the irregular part of the matrix to different CUDA kernels. Because each matrix form is compressed with a method designed specifically for it, the two-stage method adopts a partition-based hybrid format of DIA and CSR (HPDC) to ensure load balancing among computing resources and contiguous data access along the diagonals. The standard deviation among blocks is used as the criterion for determining the optimal number of blocks and the distribution of data. Experiments were conducted on matrices from the Florida sparse matrix collection. Compared with DIA, cuSPARSE-CSR, HDC, and BRCSD, the execution time of the two-stage method is shorter by factors of 4, 3.4, 1.9, and 1.15, respectively.
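To make the two-stage idea concrete, the following is a minimal CUDA sketch: the dense-diagonal portion of the matrix is kept in DIA format and handled by one kernel, while the leftover irregular entries are kept in CSR format and accumulated by a second kernel. The kernel names, layouts, and signatures here are illustrative assumptions for exposition, not the authors' actual implementation.

```cuda
#include <cuda_runtime.h>

// Stage 1: SpMV over the diagonal part stored in DIA format.
// dia_vals is laid out diagonal-major: dia_vals[d * n + row] holds the entry
// of row `row` on the diagonal with offset offsets[d], zero-padded where the
// diagonal runs off the matrix (the padding overhead the paper targets).
__global__ void spmv_dia(int n, int num_diags,
                         const int* offsets, const double* dia_vals,
                         const double* x, double* y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n) return;

    double sum = 0.0;
    for (int d = 0; d < num_diags; ++d) {
        int col = row + offsets[d];
        if (col >= 0 && col < n)
            sum += dia_vals[d * n + row] * x[col];
    }
    y[row] = sum;
}

// Stage 2: SpMV over the irregular remainder stored in CSR format,
// one thread per row, accumulated on top of the DIA-stage result.
__global__ void spmv_csr(int n, const int* row_ptr, const int* col_idx,
                         const double* vals, const double* x, double* y)
{
    int row = blockIdx.x * blockDim.x + threadIdx.x;
    if (row >= n) return;

    double sum = 0.0;
    for (int j = row_ptr[row]; j < row_ptr[row + 1]; ++j)
        sum += vals[j] * x[col_idx[j]];
    y[row] += sum;  // add to the partial result produced by spmv_dia
}
```

Splitting the matrix this way keeps the regular, densely populated diagonals in a format with contiguous, predictable access, while the scattered entries that would otherwise force heavy zero padding are handled separately in CSR.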