Abstract

In this study, we present a fast algorithm for the numerical solution of the heat equation. The heat equation models the heat diffusion over time and through a given region. We engage a finite difference method to solve this equation numerically. The performance of its parallel implementation is considered using Message Passing Interface (MPI), Compute Unified Device Architecture (CUDA), and time schemes, such as Forward Euler (FE) and Runge-Kutta (RK) methods. The originality of this study is research on parallel implementations of the fourth-order Runge-Kutta method (RK4) for sparse matrices on Graphics Processing Unit (GPU) architecture. The supreme proprietary framework for GPU computing is CUDA, provided by NVIDIA. We will show three metrics through this parallelization to compare the computing performance: time-to-solution, speed-up, and performance. The spectral method is investigated by utilizing the FFTW software library, based on the computation of the fast Fourier transforms (FFT) in parallel and distributed memory architectures. Our CUDA-based FFT, named CUFFT, is performed in platforms, which is a highly optimized FFTW implementation. We will give numerical tests to reveal that this method is up-and-coming for solving the heat equation. The final result demonstrates that CUDA has a significant advantage and performance since the computational cost is tiny compared with the MPI implementation. This vital performance gain is also achieved through careful attention of managing memory communication and access.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call