Evaluation of the Stretch S6 Hybrid Reconfigurable Embedded CPU Architecture for Power-Efficient Scientific Computing

Thang Viet Huynh, Manfred Mücke, Wilfried N Gansterer

doi:10.1016/j.procs.2012.04.021


Open Access



Abstract

Embedded CPUs typically use much less power than desktop or server CPUs but provide limited or no support for floating-point arithmetic. Hybrid reconfigurable CPUs combine fixed and reconfigurable computing fabrics to better balance execution performance and power consumption. We show how a Stretch S6 hybrid reconfigurable CPU (S6) can be extended to natively support double-precision floating-point arithmetic. For lower-precision number formats, multiple parallel arithmetic units can be implemented. We evaluate whether the superlinear performance improvement of floating-point multiplication on reconfigurable fabrics can be exploited in the framework of a hybrid reconfigurable CPU. We provide an in-depth investigation of data paths to and from the S6 reconfigurable fabric and present peak and sustained throughput as a function of wide registers used and total operand size. We demonstrate the effect of the given interface when using a floating-point fused multiply-accumulate (FMA) SIMD unit to accelerate the LINPACK benchmark. We identify a mismatch between the size of the S6's reconfigurable fabric and the available interface bandwidth as the major bottleneck limiting performance, which makes the S6 a poor choice for scientific workloads relying on native support for floating-point arithmetic.

Full Text