Abstract
Fourier and related transforms is a family of algorithms widely employed in diverse areas of computational science, notoriously difficult to scale on high-performance parallel computers with large number of processing elements (cores). This paper introduces a popular software package called P3DFFT implementing Fast Fourier Transforms (FFT) in three dimensions (3D) in a highly efficient and scalable way. It overcomes a well-known scalability bottleneck of 3D FFT implementations by using two-dimensional domain decomposition. Designed for portable performance, P3DFFT achieves excellent timings for a number of systems and problem sizes. On Cray XT5 system P3DFFT attains 45% efficiency in weak scaling from 128 to 65,536 computational cores. Library features include Fourier and Chebyshev transforms, Fortran and C interfaces, in- and out-of-place transforms, uneven data grids, single and double precision. P3DFFT is available as open source at http://code.google.com/p/p3dfft/. This paper discusses P3DFFT implementation and performance in a way that helps guide the user in making optimal choices for parameters of their runs.
Submitted Version (
Free)
Published Version
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have