Low-power processors for the Internet-of-Things (IoT) demand a high degree of adaptability to efficiently execute applications with different resource requirements under varying scenarios. Current single-ISA heterogeneous Chip Multiprocessors (CMPs), such as ARM's big.LITTLE, provide multiple cores and voltage/frequency levels to address this challenge. However, finding the best core type and corresponding voltage/frequency level for every execution scenario, each involving different applications and phases, is still far from solved. In this article, we propose extending such a single-ISA heterogeneous CMP with a Coarse-Grained Reconfigurable Array (CGRA) and a hardware-based dynamic binary translation (DBT) module that transparently maps application code onto the CGRA for acceleration. To achieve low energy levels and efficiently manage the power consumption of the CGRA, we introduce an additional voltage rail that enables operation in the Near-Threshold Voltage (NTV) regime when needed, leveraging key features of the CGRA's structure to address the implementation challenges of NTV computing. For less than 35 percent area overhead over the baseline CMP, performance and energy consumption improve as follows. Compared to (a) power-efficient execution on the LITTLE core, MuTARe achieves a 29 percent reduction in energy consumption and a $2\times$ speedup; compared to (b) performance-efficient execution on the big core, it achieves a $1.6\times$ speedup with a 41 percent energy reduction.