Abstract

Bridge++ is a general-purpose code set for a numerical simulation of lattice QCD aiming at a readable, extensible, and portable code while keeping practically high performance. The previous version of Bridge++ is implemented in double precision with a fixed data layout. To exploit the high arithmetic capability of new processor architecture, we extend the Bridge++ code so that optimized code is available as a new branch, i.e., an alternative to the original code. This paper explains our strategy of implementation and displays application examples to the following architectures and systems: Intel AVX-512 on Xeon Phi Knights Landing, Arm A64FX-SVE on Fujitsu A64FX (Fugaku), NEC SX-Aurora TSUBASA, and GPU cluster with NVIDIA V100.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call