Abstract

Sparse matrix-vector multiplication (SpMV) is used to solve linear systems and eigenvalue problems of widely varying scale that arise in numerous scientific applications. One such application is Configuration Interaction (CI), a linear method for solving the nonrelativistic Schrödinger equation for quantum-chemical multi-electron systems; it can treat the ground state as well as multiple excited states. In this paper, we develop a hybrid approach for handling CI sparse matrices. The proposed model consists of a newly developed hybrid format for storing CI sparse matrices on the Graphics Processing Unit (GPU), together with an SpMV kernel that multiplies a CI matrix stored in the proposed format by a vector, implemented in the C language on the Compute Unified Device Architecture (CUDA) platform. The proposed SpMV kernel is a vector kernel that uses the warp approach. We evaluate the new model in terms of two primary factors, memory usage and performance; compared against the cuSPARSE library and the CSR5 (Compressed Sparse Row 5) format, our kernel outperforms both.
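The paper's kernel itself is not reproduced on this page. As background for the "vector kernel with the warp approach" mentioned above, the sketch below shows a generic warp-per-row CSR SpMV kernel; the name csr_spmv_warp, the CSR array names, and the fixed warp size of 32 are assumptions for illustration and do not reflect the paper's hybrid format.

// Generic warp-per-row CSR SpMV sketch: one 32-thread warp processes one matrix row.
// Assumed layout: row_ptr (n_rows + 1 entries), col_idx and val (one entry per nonzero),
// x (dense input vector), y (output vector).
__global__ void csr_spmv_warp(int n_rows, const int *row_ptr, const int *col_idx,
                              const double *val, const double *x, double *y)
{
    int tid  = blockIdx.x * blockDim.x + threadIdx.x;
    int lane = tid & 31;          // position of this thread within its warp
    int row  = tid >> 5;          // one warp per row

    if (row < n_rows) {
        double sum = 0.0;
        // Each lane accumulates a strided subset of the row's nonzeros.
        for (int j = row_ptr[row] + lane; j < row_ptr[row + 1]; j += 32)
            sum += val[j] * x[col_idx[j]];

        // Warp-level reduction of the partial sums.
        for (int offset = 16; offset > 0; offset >>= 1)
            sum += __shfl_down_sync(0xffffffff, sum, offset);

        if (lane == 0)
            y[row] = sum;         // lane 0 writes the final result for this row
    }
}

Assigning a whole warp to each row (rather than a single thread) keeps memory accesses to col_idx and val coalesced, which is the usual motivation for "vector" SpMV kernels on GPUs.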

Highlights

  • A Graphics Processing Unit (GPU) is an electronic chip designed for extremely fast parallel computation and data processing

  • We propose a new model for storing Configuration Interaction (CI) sparse matrices on the GPU

  • We have implemented the Sparse matrix-vector multiplication (SpMV) kernel for the proposed model


Summary

Introduction

A Graphics Processing Unit (GPU) is an electronic chip designed for extremely fast parallel computation and data processing. The constant memory can be accessed by all the threads within the grid, just like the global memory. The CPU launches multiple copies of a kernel on the GPU, one per parallel thread, to process the data residing on the GPU. For example, a simple GPU kernel (written in CUDA) called AddArrays can be used to add two integer arrays: the CPU launches multiple copies of the AddArrays kernel on parallel threads (one copy per thread) to perform the addition (see the sketch after this paragraph). Although CUDA allows us to run millions of threads or more, programs that run on the GPU are not a million times faster than on the CPU, for several reasons; among them, it takes time to copy data from the CPU to the GPU and vice versa. Improving the SpMV operation is, in fact, extremely critical to the performance of a variety of scientific applications.
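The original AddArrays listing is not reproduced on this page; a minimal sketch of such a kernel, assuming one array element per thread and a length parameter n, could look like this:

// Sketch of a CUDA kernel that adds two integer arrays element-wise.
// Each thread computes one output element; n is the array length (assumed parameter).
__global__ void AddArrays(const int *a, const int *b, int *c, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // global thread index
    if (i < n)                                      // guard threads past the array end
        c[i] = a[i] + b[i];
}

// Example launch: enough 256-thread blocks to cover n elements (device pointers assumed).
// AddArrays<<<(n + 255) / 256, 256>>>(d_a, d_b, d_c, n);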

The Schrödinger Equation
The CI Matrix Elements
The CI Matrix
The Proposed Work
Common Formats
The Sliced ELLPACK Format
The Sliced ELLPACK-R Format
The Proposed Model
The Developed SpMV Kernel
System Configuration
The Results
Conclusions
Future Work
