AAQAL: A Machine Learning-Based Tool for Performance Optimization of Parallel SPMV Computations Using Block CSR

Muhammad Ahmed,Nehad Ali Shah,Adel A Bahadded,M Usman Ashraf,Sardar Usman,Ahmed Mohammed Alghamdi,Khalid Ali Almarhabi

doi:10.3390/app12147073

Abstract

The sparse matrix–vector product (SpMV), considered one of the seven dwarfs (numerical methods of significance), is essential in high-performance real-world scientific and analytical applications requiring solution of large sparse linear equation systems, where SpMV is a key computing operation. As the sparsity patterns of sparse matrices are unknown before runtime, we used machine learning-based performance optimization of the SpMV kernel by exploiting the structure of the sparse matrices using the Block Compressed Sparse Row (BCSR) storage format. As the structure of sparse matrices varies across application domains, optimizing the block size is important for reducing the overall execution time. Manual allocation of block sizes is error prone and time consuming. Thus, we propose AAQAL, a data-driven, machine learning-based tool that automates the process of data distribution and selection of near-optimal block sizes based on the structure of the matrix. We trained and tested the tool using different machine learning methods—decision tree, random forest, gradient boosting, ridge regressor, and AdaBoost—and nearly 700 real-world matrices from 43 application domains, including computer vision, robotics, and computational fluid dynamics. AAQAL achieved 93.47% of the maximum attainable performance with a substantial difference compared to in practice manual or random selection of block sizes. This is the first attempt at exploiting matrix structure using BCSR, to select optimal block sizes for the SpMV computations using machine learning techniques.

Full Text