Improving power efficiency of dense linear algebra algorithms on multi-core processors via slack control

Pedro Alonso, Enrique S Quintana-Orti, Rafael Mayo, Manuel F Dolz

doi:10.1109/hpcsim.2011.5999861

Abstract

This paper addresses the efficient exploitation of task-level parallelism, present in many dense linear algebra operations, from the point of view of both computational performance and energy consumption. In particular, we consider a procedure, the Slack Reduction Algorithm (SRA), that optimizes the execution frequency of a collection of tasks (into which many dense linear algebra algorithms can be decomposed) on multi-core architectures. The results of this procedure are modulated by an energy-aware simulator, which is in charge of scheduling and mapping the execution of these tasks onto the cores, leveraging the dynamic voltage and frequency scaling (DVFS) featured by current technology. Simultaneously, the simulator evaluates the performance benefits of the solution. Experiments with these tools show significant energy gains for two key dense linear algebra operations: the Cholesky and QR factorizations.
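To make the notion of slack concrete, the following minimal sketch computes the slack of each task in a small dependency graph and lowers the frequency of tasks that are off the critical path. It is not the paper's SRA: the greedy loop, the task names, the durations, the frequency set, and the assumption that execution time scales inversely with frequency are all hypothetical choices made only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def schedule_times(durations, preds):
    """Return (topological order, slack per task, makespan) for a task DAG."""
    order = list(TopologicalSorter(preds).static_order())
    # Forward pass: earliest finish time of every task.
    earliest_finish = {}
    for t in order:
        start = max((earliest_finish[p] for p in preds.get(t, ())), default=0.0)
        earliest_finish[t] = start + durations[t]
    makespan = max(earliest_finish.values())
    # Backward pass: latest finish time that does not delay the makespan.
    latest_finish = {t: makespan for t in order}
    for t in reversed(order):
        for p in preds.get(t, ()):
            latest_finish[p] = min(latest_finish[p], latest_finish[t] - durations[t])
    slack = {t: latest_finish[t] - earliest_finish[t] for t in order}
    return order, slack, makespan


def assign_frequencies(base_durations, preds, freqs):
    """Greedy illustration: give each task the lowest frequency its slack allows.

    Assumption (hypothetical): time at frequency f is base_time * f_max / f.
    Slack is recomputed after every change so shared slack is not used twice.
    """
    f_max = max(freqs)
    durations = dict(base_durations)
    chosen = {t: f_max for t in durations}
    order, _, _ = schedule_times(durations, preds)
    for t in order:
        _, slack, _ = schedule_times(durations, preds)
        budget = durations[t] + slack[t]
        for f in sorted(freqs):  # try the lowest frequency first
            stretched = base_durations[t] * f_max / f
            if stretched <= budget + 1e-9:
                chosen[t] = f
                durations[t] = stretched
                break
    return chosen, durations


# Hypothetical four-task graph: B and C depend on A, D depends on B and C.
preds = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
base = {"A": 4.0, "B": 6.0, "C": 2.0, "D": 3.0}  # time units at f_max
freqs = [2.0, 1.5, 1.0]                          # GHz, illustrative values

chosen, stretched = assign_frequencies(base, preds, freqs)
makespan_after = schedule_times(stretched, preds)[2]
print(chosen, makespan_after)
```

In this example only task C lies off the critical path A, B, D, so it is the only task that can run at a lower frequency without lengthening the overall execution time.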

Full Text