Abstract

Parallel computing techniques are applied to a linear acoustic wave model to reduce execution time. Three parallel computing models, fork-and-join, SPMD, and SIMT, are developed to define the execution of the parallel computations. The precision and efficiency of the linear acoustic wave model are improved, with substantial speedups achieved in all implementations. Furthermore, exploiting the axisymmetric properties of certain acoustic fields reduces their spatio-temporal complexity by removing redundant computations. The linear acoustic wave model is also modified and extended to describe wave propagation across multiple media rather than a single medium. The developed implementations are integrated into a package for high-performance simulation of two- or three-dimensional linear acoustic fields generated by realistic sources in various fluid media.

Highlights

  • Processing units with multiple cores are common due to advances in technology

  • The sequential algorithm is studied and data-level parallelism inherent in the algorithm is exploited with multiple high performance computing (HPC) approaches that each offer certain advantages and disadvantages

  • With multiple Open Multi-Processing (OpenMP) threads, the linear acoustic simulation exploits the full capabilities of the multi-core architectures prevalent today

Summary

Introduction

Processing units with multiple cores are common due to advances in technology. CPUs composed of multiple cores indicate a trend in performance enhancement toward greater throughput rather than faster processor clock speeds [5, 6]. Aside from multi-core CPUs, the development of GPUs and their application to areas outside of graphics provides an alternative means of increasing execution speed. Cache memory on a GPU is limited compared to cache memory on a CPU, with the hardware instead devoted to a large number of parallel processing elements. These design features are rooted in the nature of graphics workloads, where greater throughput is desired for many highly independent operations that may be performed in parallel. Both CPUs and GPUs rely on the concept of threads as a unit of parallel execution.
