Abstract
Parallel programming is moving computation from single-core to multicore architectures. Graphics Processor Units (GPUs) have recently emerged as outstanding platforms for data-parallel applications with regular data access patterns. However, it remains challenging to optimize computations with irregular data access patterns, such as sparse matrix-vector multiplication (SPMV). SPMV is one of the most important computational kernels in engineering practice and scientific computation. Various data formats for storing the sparse matrix have been implemented on GPUs to maximize performance. In this paper, we propose and evaluate a new implementation of SPMV on the GPU based on the QCSR storage format, which combines the quadtree storage format and the CSR format. We also outline some optimization strategies to improve performance. Compared with previously published implementations, ours achieves higher overall performance than the BCSR format, with an average speedup of 1.15× over BCSR.
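For readers unfamiliar with the kernel, the following is a minimal sequential sketch of SPMV over the plain CSR format, written in Python for readability. It is not the QCSR scheme or the GPU implementation proposed in the paper; the function name `spmv_csr` and the array names `row_ptr`, `col_idx`, and `values` are illustrative assumptions. It shows the indirect access `x[col_idx[k]]` that makes the memory access pattern irregular.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i]..row_ptr[i+1] delimits the nonzeros of row i,
    col_idx[k] is the column of the k-th nonzero,
    values[k] is its numerical value.
    (All names are illustrative, not taken from the paper.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for i in range(n_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            # Indirect load from x: the source of irregular memory access.
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Example matrix:
# A = [[4, 0, 0],
#      [0, 0, 2],
#      [1, 3, 0]]
row_ptr = [0, 1, 2, 4]
col_idx = [0, 2, 0, 1]
values = [4.0, 2.0, 1.0, 3.0]
x = [1.0, 2.0, 3.0]
y = spmv_csr(row_ptr, col_idx, values, x)
```

Each row's dot product is independent of the others, which is why GPU implementations commonly assign rows (or blocks of rows) to threads; blocked formats such as BCSR group nonzeros into small dense blocks to make the accesses to `x` more regular.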