Abstract

Roe’s flux-difference splitting scheme has been implemented on the NVIDIA CUDA architecture and applied to the two-dimensional compressible Euler equations. Several standard test cases have been considered in order to estimate the speed-up of GPU computing with respect to CPU calculation. The kernel configuration has been described in detail, and a theoretical analysis of the GPU execution time as a function of the number of threads managed by the kernels is also reported. The loss of performance caused by the use of zero-copy memory has been fully described. Significant performance improvements have been obtained by using a more recent GPU and CUDA Toolkit. A multi-GPU test case based on the domain decomposition approach has also been presented.
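
For readers unfamiliar with the scheme named above, the following is a minimal CPU-side sketch in Python of a Roe numerical flux for the two-dimensional Euler equations across a single cell face. It is an illustration only, not the authors' CUDA kernel: the function names `physical_flux` and `roe_flux`, the choice of primitive variables (rho, u, v, p) as input, the value of `GAMMA`, and the omission of an entropy fix are all assumptions made here for brevity.

```python
import math

GAMMA = 1.4  # assumed ratio of specific heats (ideal diatomic gas)


def physical_flux(rho, u, v, p, nx, ny):
    """Exact Euler flux through a face with unit normal (nx, ny)."""
    un = u * nx + v * ny                                  # normal velocity
    E = p / (GAMMA - 1.0) + 0.5 * rho * (u * u + v * v)   # total energy per volume
    return (rho * un,
            rho * u * un + p * nx,
            rho * v * un + p * ny,
            (E + p) * un)


def roe_flux(left, right, nx, ny):
    """Roe flux between two primitive states (rho, u, v, p); no entropy fix."""
    rl, ul, vl, pl = left
    rr, ur, vr, pr = right

    # Total specific enthalpy on each side
    hl = GAMMA / (GAMMA - 1.0) * pl / rl + 0.5 * (ul * ul + vl * vl)
    hr = GAMMA / (GAMMA - 1.0) * pr / rr + 0.5 * (ur * ur + vr * vr)

    # Roe (square-root-density weighted) averages
    sl, sr = math.sqrt(rl), math.sqrt(rr)
    u = (sl * ul + sr * ur) / (sl + sr)
    v = (sl * vl + sr * vr) / (sl + sr)
    h = (sl * hl + sr * hr) / (sl + sr)
    rho = sl * sr
    q2 = u * u + v * v
    a = math.sqrt((GAMMA - 1.0) * (h - 0.5 * q2))         # averaged sound speed
    un = u * nx + v * ny                                  # averaged normal velocity
    ut = -u * ny + v * nx                                 # averaged tangential velocity

    # Jumps across the face
    drho = rr - rl
    dp = pr - pl
    dun = (ur - ul) * nx + (vr - vl) * ny
    dut = -(ur - ul) * ny + (vr - vl) * nx

    # (|eigenvalue|, wave strength, right eigenvector) for each of the four waves
    waves = [
        (abs(un - a), (dp - rho * a * dun) / (2.0 * a * a),
         (1.0, u - a * nx, v - a * ny, h - a * un)),      # left acoustic wave
        (abs(un), drho - dp / (a * a),
         (1.0, u, v, 0.5 * q2)),                          # entropy wave
        (abs(un), rho * dut,
         (0.0, -ny, nx, ut)),                             # shear wave
        (abs(un + a), (dp + rho * a * dun) / (2.0 * a * a),
         (1.0, u + a * nx, v + a * ny, h + a * un)),      # right acoustic wave
    ]

    fl = physical_flux(rl, ul, vl, pl, nx, ny)
    fr = physical_flux(rr, ur, vr, pr, nx, ny)

    # Central average minus upwind dissipation
    flux = []
    for k in range(4):
        dissipation = sum(lam * alpha * r[k] for lam, alpha, r in waves)
        flux.append(0.5 * (fl[k] + fr[k]) - 0.5 * dissipation)
    return tuple(flux)
```

In a GPU implementation of this kind, a flux evaluation like `roe_flux` would typically be carried out independently for each cell face, which is what makes the scheme a natural fit for one-thread-per-face or one-thread-per-cell kernels; how the authors actually map work to threads is described in the paper itself, not in this sketch.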

