Abstract

This paper evaluates the potential of the embedded graphics processing unit (GPU) in Nvidia's Tegra K1 for onboard processing. Its performance is compared to that of a general-purpose multicore central processing unit (CPU), a full-fledged GPU accelerator, and an Intel Xeon Phi coprocessor on two representative applications: wavelet spectral dimension reduction of hyperspectral imagery and automated cloud-cover assessment (ACCA). On these applications, the Tegra K1 achieved 51% of the performance of a high-end eight-core Intel Xeon server CPU for the ACCA algorithm and 20% for the dimension reduction algorithm, while the Xeon CPU has a 13.5 times higher power consumption. The paper also presents optimized parallel implementations of both algorithms and uses them to show the potential of modern high-performance computing accelerators for algorithms of this kind. The two algorithms tested consist mostly of spatially localized computations, and one can assume that other image processing algorithms dominated by localized computations would exhibit similar speed-ups when implemented on these parallel architectures.
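
As a rough reading of these figures, the relative performance per watt of the Tegra K1 against the Xeon CPU can be estimated as below. This is only a sketch under two assumptions that the abstract does not state: that "performance" means throughput on the same workload, and that the 13.5 times power ratio holds while both applications run. The symbols P (performance) and W (power consumption) are introduced here for illustration.

```latex
\[
\frac{P_{\mathrm{TK1}} / W_{\mathrm{TK1}}}{P_{\mathrm{Xeon}} / W_{\mathrm{Xeon}}}
  = \frac{P_{\mathrm{TK1}}}{P_{\mathrm{Xeon}}} \cdot \frac{W_{\mathrm{Xeon}}}{W_{\mathrm{TK1}}}
\]
\[
\text{ACCA: } 0.51 \times 13.5 \approx 6.9
\qquad
\text{Dimension reduction: } 0.20 \times 13.5 = 2.7
\]
```

Under those assumptions, the Tegra K1 would deliver roughly 6.9 times the performance per watt of the Xeon CPU on ACCA and about 2.7 times on wavelet dimension reduction.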
