Abstract

This paper addresses the class imbalance problem in the automatic selection of the best storage format for a sparse matrix, with the aim of maximizing the performance of sparse matrix-vector multiplication (SpMV) on GPUs. Our classification method uses convolutional neural networks (CNNs), and we propose several solutions to mitigate the bias toward the majority classes when the data are not balanced. First, the CNNs are trained using images that represent the sparsity pattern of the matrices, whose pixels are colored according to different matrix features. In addition, we introduce a new network called SpNet, which achieves better prediction accuracy than a standard network such as AlexNet despite having a simpler architecture. Finally, sampling techniques and cost-sensitive methods are studied to place more emphasis on minority classes. The experiments show that our classifiers select the best performing format 92.8% of the time, obtaining 98.3% of the maximum attainable SpMV performance. A comparison with other state-of-the-art classification methods is also provided, demonstrating the benefits of our proposal.

Highlights

  • Sparse matrix-vector multiplication (SpMV) is considered one of the most important computational kernels lying at the heart of many scientific and engineering applications

  • Given that SpMV performance depends on both the target parallel system and the sparsity structure of the matrix, many existing storage formats have focused on a particular application domain, sparsity pattern and/or computer architecture

  • In this paper we address the automatic classification of sparse matrices to select the best performing storage format for SpMV on GPUs using convolutional neural networks (CNNs)

Summary

INTRODUCTION

Sparse matrix-vector multiplication (SpMV) is considered one of the most important computational kernels lying at the heart of many scientific and engineering applications. We assume that a large set of sparse matrices coming from different application domains and representing a variety of sparsity patterns is available. This dataset is the input of the following phases: SpMV benchmarking and image generation. It is necessary to feed the CNN with a set of images labeled according to the best performing storage format (the class of the matrix). This data is generated in the previous phases.

SpMV BENCHMARKING AND IMAGE DATASET GENERATION

Matrices should be labeled according to their best storage format (class) before training a network. This goal is achieved in the SpMV benchmarking phase. Datasets consist of 256×256 images, which corresponds to the input size of the AlexNet network.
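The two phases described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sparsity_image` and `best_format`, the grayscale density encoding, and the format names in the usage note are all assumptions made for illustration. The summary only states that pixels are colored according to matrix features and that images are 256×256; here each pixel simply encodes how many nonzeros fall into the corresponding block of the matrix.

```python
import numpy as np

def sparsity_image(rows, cols, shape, size=256):
    """Downsample a sparsity pattern to a size x size grayscale image.

    rows, cols: coordinates of the nonzero entries (COO style).
    shape: (n_rows, n_cols) of the matrix.
    Pixel intensity is the nonzero count of the matching block, scaled to
    0..255 (an assumed encoding, not the paper's coloring scheme).
    """
    n_rows, n_cols = shape
    rows = np.asarray(rows, dtype=np.int64)
    cols = np.asarray(cols, dtype=np.int64)
    # Map every nonzero coordinate to the pixel that covers it.
    pixel_r = rows * size // n_rows
    pixel_c = cols * size // n_cols
    counts = np.zeros((size, size), dtype=np.int64)
    # np.add.at accumulates correctly when several nonzeros hit one pixel.
    np.add.at(counts, (pixel_r, pixel_c), 1)
    peak = counts.max()
    if peak == 0:
        return np.zeros((size, size), dtype=np.uint8)
    return (counts * 255 // peak).astype(np.uint8)

def best_format(timings):
    """Return the label (class) of the format with the lowest SpMV time.

    timings: dict mapping a format name to a measured time, as would be
    produced by a benchmarking phase (hypothetical input layout).
    """
    return min(timings, key=timings.get)
```

For example, a 1024×1024 identity matrix yields an image whose diagonal pixels are 255 and whose remaining pixels are 0, and `best_format({"CSR": 1.2, "ELL": 0.8, "HYB": 1.0})` returns `"ELL"`. The format names and timings here are placeholders, not measurements from the paper.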

NETWORKS AND TRAINING PROCESS
ADDRESSING CLASS IMBALANCE
Findings
CONCLUSIONS