Abstract

Current workstations can offer really amazing raw computational power: up to 10 TFlops on a single machine equipped with multiple CPUs and accelerators as the Intel Xeon Phi or GPU devices. Such results can only be achieved with a massive parallelism of computational devices, thus the actual barrier posed by the exploitation of modern heterogeneous HPC resources is the difficulty in development and/or (performance) efficient porting of software on such architectures. In this paper, we present an experimental study about achievable performance of a widely used, computational intensive application the Fourier Transform, i.e. Discrete Fourier Transform (DFT) and Fast Fourier Transform. We propose an evaluation of the benefits obtained exploiting such resources in terms of performance and programming efforts in the development of the code with a emphasis on the programming approach adopted for code parallelization. With the exception of the interesting performance achieved exploiting GPU for the DFT algorithm, the use state-ofthe- art software libraries provide the best solution since they represent a good compromise to balance programming efforts and performance achievements.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.