Abstract
We describe the performance of Chicoma, a 3D unstructured mesh compressible flow solver, on graphics processing unit (GPU) hardware. The approach used to deploy the solver on GPU architectures derives from the threaded multicore execution model used in Chicoma and attempts to improve memory performance by applying graph theory techniques. The result is a scheme that can be deployed on the GPU with high-level programming constructs, such as compiler directives, rather than low-level programming extensions. With an NVIDIA Fermi-class GPU (NVIDIA Corp., Santa Clara, CA, USA) and double precision floating point arithmetic, we observe performance gains of 4–5× on problem sizes of 10^6–10^7 tetrahedra. We also compare GPU performance to threaded multicore performance with OpenMP and demonstrate hybrid multicore-GPU calculations with adaptive mesh refinement. Published 2012. This article is a US Government work and is in the public domain in the USA.
More From: International Journal for Numerical Methods in Fluids