Abstract

In this paper, we analyze the interactions within the performance‐power‐energy triangle for the execution of a pivotal numerical algorithm, the iterative conjugate gradient (CG) method, on a diverse collection of parallel multithreaded architectures. This analysis is especially timely in a decade where the power wall has arisen as a major obstacle to building faster processors. Moreover, the CG method has recently been proposed as a complement to the LINPACK benchmark, as this iterative method is argued to be more archetypical of the performance of today's scientific and engineering applications. To gain insight into the benefits of hands‐on optimization, we include runtime and energy efficiency results both for out‐of‐the‐box usage relying exclusively on compiler optimizations and for implementations manually optimized for the target architectures, which range from general‐purpose and digital signal multicore processors to manycore graphics processing units, all representative of current multithreaded systems. Copyright © 2014 John Wiley & Sons, Ltd.
