Abstract
Recent developments in modern computational accelerators such as Graphics Processing Units (GPUs) and coprocessors provide great opportunities for making scientific applications run faster than ever before. However, efficient parallelization of scientific code using new programming tools such as CUDA requires a level of expertise that is not available to many scientists. This, together with the fact that parallelized code is usually not portable across architectures, creates major challenges for exploiting the full capabilities of modern computational accelerators. In this work, we sought to overcome these challenges by studying how to achieve both automated parallelization using OpenACC and enhanced portability using OpenCL. We applied our parallelization schemes on GPUs as well as on the Intel Many Integrated Core (MIC) coprocessor to reduce the run time of wave propagation simulations, using a well-established 2D cardiac action potential model as a specific case study. To the best of our knowledge, we are the first to study auto-parallelization of 2D cardiac wave propagation simulations using OpenACC. Our results identify several approaches that provide substantial speedups. The OpenACC-generated GPU code achieved a considerable speedup over the sequential implementation while requiring the addition of only a few OpenACC pragmas. An OpenCL implementation running on GPUs was faster than both the sequential implementation and a parallelized OpenMP implementation. An OpenMP implementation on the Intel MIC coprocessor also provided speedups with only a few changes to the sequential code. We highlight that OpenACC provides an automatic, efficient, and portable approach to parallelizing 2D cardiac wave simulations on GPUs. Our approach of using OpenACC, OpenCL, and OpenMP to parallelize this particular model on modern computational accelerators should be applicable to other computational models of wave propagation in multi-dimensional media.
Highlights
Recent developments in the field of high performance computing have greatly expanded the computational capabilities and applications of Graphics Processing Units (GPUs)
We found that the sweet spot for our cardiac model is a 128×1 work-group (thread-block) size for both the CUDA and OpenCL implementations running on both GPU cards (see the launch-configuration sketch after this list)
OpenACC implementation: having identified the hotspot of the sequential program, we can add OpenACC directives to offload that code block to accelerators such as GPUs and coprocessors (sketched below)
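The following is a minimal sketch of the kind of OpenACC offload described above, not the paper's actual code: the grid size, time step, diffusion coefficient, and simplified voltage-diffusion update are illustrative assumptions standing in for the cardiac model's hotspot loop.

/* Minimal OpenACC sketch of offloading a 2D explicit update loop
 * (an assumed stand-in for the cardiac model's hotspot).
 * Build with an OpenACC compiler, e.g. nvc -acc -O2 sketch.c */
#include <stdlib.h>

#define N   512       /* assumed grid dimension */
#define DT  0.05f     /* assumed time step */
#define D   0.001f    /* assumed diffusion coefficient */

static void step(float *restrict v, float *restrict v_new)
{
    /* Offload the nested spatial loops; the data clauses move the
     * arrays to the device for this call. A production code would
     * keep them resident across time steps with an enclosing
     * #pragma acc data region. */
    #pragma acc parallel loop collapse(2) \
        copyin(v[0:N*N]) copy(v_new[0:N*N])
    for (int i = 1; i < N - 1; i++) {
        for (int j = 1; j < N - 1; j++) {
            float lap = v[(i-1)*N + j] + v[(i+1)*N + j]
                      + v[i*N + (j-1)] + v[i*N + (j+1)]
                      - 4.0f * v[i*N + j];
            v_new[i*N + j] = v[i*N + j] + DT * D * lap;
        }
    }
}

int main(void)
{
    float *v     = calloc(N * N, sizeof(float));
    float *v_new = calloc(N * N, sizeof(float));
    v[(N / 2) * N + N / 2] = 1.0f;     /* a single stimulated node */
    for (int t = 0; t < 100; t++) {    /* short illustrative run */
        step(v, v_new);
        float *tmp = v; v = v_new; v_new = tmp;
    }
    free(v);
    free(v_new);
    return 0;
}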
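The 128×1 work-group size mentioned above is specified at kernel launch on the host. The sketch below shows only where that launch configuration enters an OpenCL host program; the trivial kernel, grid size, and variable names are assumptions for illustration and are not the paper's code. Error checking is omitted for brevity.

/* Minimal OpenCL host sketch: launching a 2D kernel with a 128x1
 * work-group size (the reported sweet spot). Link with -lOpenCL. */
#define CL_TARGET_OPENCL_VERSION 120
#ifdef __APPLE__
#include <OpenCL/cl.h>
#else
#include <CL/cl.h>
#endif
#include <stdlib.h>

#define NX 512   /* assumed grid width (a multiple of 128) */
#define NY 512   /* assumed grid height */

/* A trivial kernel standing in for the cardiac update kernel. */
static const char *src =
    "__kernel void scale(__global float *v) {      \n"
    "    int i = get_global_id(0);                 \n"
    "    int j = get_global_id(1);                 \n"
    "    v[j * get_global_size(0) + i] *= 0.5f;    \n"
    "}                                             \n";

int main(void)
{
    cl_platform_id plat;
    cl_device_id dev;
    clGetPlatformIDs(1, &plat, NULL);
    clGetDeviceIDs(plat, CL_DEVICE_TYPE_GPU, 1, &dev, NULL);

    cl_context ctx = clCreateContext(NULL, 1, &dev, NULL, NULL, NULL);
    cl_command_queue q = clCreateCommandQueue(ctx, dev, 0, NULL);

    cl_program prog = clCreateProgramWithSource(ctx, 1, &src, NULL, NULL);
    clBuildProgram(prog, 1, &dev, NULL, NULL, NULL);
    cl_kernel k = clCreateKernel(prog, "scale", NULL);

    float *host = calloc(NX * NY, sizeof(float));
    cl_mem buf = clCreateBuffer(ctx, CL_MEM_READ_WRITE | CL_MEM_COPY_HOST_PTR,
                                NX * NY * sizeof(float), host, NULL);
    clSetKernelArg(k, 0, sizeof(cl_mem), &buf);

    /* Launch configuration: a 2D global range covering the grid,
     * decomposed into 128x1 work-groups. */
    size_t global[2] = { NX, NY };
    size_t local[2]  = { 128, 1 };
    clEnqueueNDRangeKernel(q, k, 2, NULL, global, local, 0, NULL, NULL);
    clFinish(q);

    clReleaseMemObject(buf);
    clReleaseKernel(k);
    clReleaseProgram(prog);
    clReleaseCommandQueue(q);
    clReleaseContext(ctx);
    free(host);
    return 0;
}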
Summary
Recent developments in the field of high performance computing have greatly expanded the computational capabilities and applications of Graphics Processing Units (GPUs). GPUs are used in the fields of bioinformatics [1], signal processing [2], astronomy [3], weather forecasting [4], and molecular modeling [5]. In addition to GPUs, Intel's new Many Integrated Core (MIC) architecture provides a powerful parallel platform for complex computations. The Intel Xeon Phi is the first accelerator based on the MIC architecture and is expected to accelerate oil exploration, climate simulation, and financial analyses, as well as other applications [6]. While new accelerators promise improved computational performance, the software tools used to drive their parallelism (such as the CUDA programming language) require expertise that is not widely available. Moreover, CUDA code runs only on NVIDIA GPUs, which limits its portability to other accelerators.