Abstract

In this paper, we analyze in detail the sources of power dissipation and energy consumption during the execution of current dense linear algebra kernels on multicore processors, relating these two metrics, together with performance, to the arithmetic intensity of the operations. In particular, by leveraging the RAPL interface of an Intel E5 ("Sandy Bridge") six-core CPU, we decompose power and energy into their core (mainly due to floating-point units and cache), RAM (off-chip accesses), and uncore components, performing a series of illustrative experiments for a range of high-performance kernels, from memory-bound to CPU-bound. Additionally, we investigate the energy proportionality of these three architecture components for the execution of linear algebra routines on the Intel E5.
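As an illustration of the decomposition described above, the following is a minimal sketch, not the authors' measurement code. It assumes that cumulative energy counters for the package, core, and DRAM domains have been sampled before and after a kernel run, in microjoules (the unit used, for example, by the Linux powercap interface), and that the uncore share can be approximated as package minus core. The function names, the dictionary keys, and the single-wraparound handling are all assumptions made for the example.

```python
def energy_delta_uj(before, after, max_range_uj):
    """Energy (microjoules) between two counter readings.

    Assumes the counter wrapped at most once between the readings.
    """
    if after >= before:
        return after - before
    # Counter wrapped around: add the counter range back in.
    return after + max_range_uj - before


def decompose(before, after, elapsed_s, max_range_uj):
    """Split energy (J) and average power (W) into core, DRAM, and uncore.

    `before` and `after` are dicts with the hypothetical keys
    "package", "core", and "dram", holding counter values in microjoules.
    Uncore is estimated as package minus core (an assumption of this sketch).
    """
    joules = {
        domain: energy_delta_uj(before[domain], after[domain], max_range_uj) / 1e6
        for domain in ("package", "core", "dram")
    }
    joules["uncore"] = joules["package"] - joules["core"]
    watts = {domain: energy / elapsed_s for domain, energy in joules.items()}
    return joules, watts


if __name__ == "__main__":
    # Made-up counter values for a 2-second run, for illustration only.
    before = {"package": 1_000_000, "core": 500_000, "dram": 200_000}
    after = {"package": 61_000_000, "core": 40_500_000, "dram": 10_200_000}
    joules, watts = decompose(before, after, elapsed_s=2.0, max_range_uj=2**32)
    for domain in ("core", "dram", "uncore"):
        print(f"{domain}: {joules[domain]:.1f} J, {watts[domain]:.1f} W")
```

Dividing each energy delta by the elapsed time gives the average power per component, so the same two counter snapshots yield both metrics discussed in the abstract.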
