Unveiling the performance‐energy trade‐off in iterative linear system solvers for multithreaded processors

José I Aliaga, Enrique S Quintana-Ortí, Maribel Castillo, Joaquín Pérez, Juan C Fernández, Germán León, Hartwig Anzt

doi:10.1002/cpe.3341

Abstract

In this paper, we analyze the interactions within the performance‐power‐energy triangle for the execution of a pivotal numerical algorithm, the iterative conjugate gradient (CG) method, on a diverse collection of parallel multithreaded architectures. This analysis is especially timely in a decade where the power wall has arisen as a major obstacle to building faster processors. Moreover, the CG method has recently been proposed as a complement to the LINPACK benchmark, as this iterative method is argued to be more archetypical of the performance of today's scientific and engineering applications. To gain insight into the benefits of hands‐on optimizations, we include runtime and energy efficiency results for both out‐of‐the‐box usage, relying exclusively on compiler optimizations, and implementations manually optimized for the target architectures. These architectures range from general‐purpose and digital signal multicore processors to manycore graphics processing units, all representative of current multithreaded systems. Copyright © 2014 John Wiley & Sons, Ltd.
