Evaluating Performance and Energy-efficiency of a Parallel Signal Correlation Algorithm on Current Multi and Manycore Architectures

Arne Hendricks, Thomas Heller, Andreas Schäfer, Max Kasparek, Dietmar Fey

doi:10.1016/j.procs.2016.05.484

Open Access

https://doi.org/10.1016/j.procs.2016.05.484

Abstract

The increasing variety and affordability of multi- and many-core embedded architectures pose both a challenge and an opportunity to developers of high-performance computing applications. In this paper we present a case study in which we develop and evaluate a unified parallel approach to a signal-correlation algorithm currently in use in a commercial/industrial locating system. We utilize both the HPX C++ and CUDA runtimes to achieve scalable code for current embedded multi- and many-core architectures (NVIDIA Tegra, Intel Broadwell M, ARM Cortex-A15). We also compare our approach against traditional high-performance hardware as well as a native embedded many-core variant. To increase the accuracy of our performance analysis we introduce a dedicated performance model. The results show that our approach is feasible and enables us to harness the advantages of modern micro-server architectures, but they also indicate that some of the currently existing many-core embedded architectures have limitations that can lead to traditional hardware being superior in both efficiency and absolute performance.

Full Text