Abstract

We show that typical parallelization strategies for CPU implementations can partly be reused to develop efficient GPU implementations, and we point out what should be considered when comparing a CPU with a GPU implementation. To this end we focus on our main concern, the parallelization of a class of lattice group models (LGpMs) for the Boltzmann equation. We give a short overview of the mathematical approach behind these models and then compare the CPU with the GPU architecture to establish the basis for the applied parallelization strategies, without going into detail about the underlying C implementation. Finally, we use the speedups achieved by our LGpM parallelization on GPUs and CPUs to compare the two architectures in terms of computational performance relative to their initial costs.
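To make the "one thread per lattice site" pattern alluded to above concrete, the following is a minimal CUDA sketch of a data-parallel relaxation sweep over a 2D lattice. It is not the paper's implementation: the grid dimensions NX and NY, the relaxation parameter OMEGA, the constant equilibrium value, and the kernel name relax_sites are all illustrative assumptions; the actual LGpM discretization and kernels differ.

```cuda
#include <cuda_runtime.h>
#include <stdio.h>

/* Illustrative lattice size and relaxation parameter (assumptions,
 * not taken from the paper). */
#define NX 256
#define NY 256
#define OMEGA 1.0f

/* One thread per lattice site: each thread relaxes its local value
 * towards a (here trivially constant) equilibrium. This is the same
 * data-parallel pattern obtained by parallelizing the site loop on a CPU. */
__global__ void relax_sites(const float *f_in, float *f_out, float f_eq)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= NX || y >= NY) return;

    int idx = y * NX + x;
    /* BGK-style relaxation: f <- f + omega * (f_eq - f) */
    f_out[idx] = f_in[idx] + OMEGA * (f_eq - f_in[idx]);
}

int main(void)
{
    size_t bytes = (size_t)NX * NY * sizeof(float);
    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemset(d_in, 0, bytes);

    /* 16x16 thread blocks covering the whole lattice. */
    dim3 block(16, 16);
    dim3 grid((NX + block.x - 1) / block.x, (NY + block.y - 1) / block.y);
    relax_sites<<<grid, block>>>(d_in, d_out, 1.0f);
    cudaDeviceSynchronize();

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

The point of the sketch is only that the CPU-style decomposition of the lattice into independent site updates maps directly onto GPU threads; memory layout, streaming steps, and boundary handling, which the paper addresses, are omitted here.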
