We observe that the execution of many dynamic instructions has no consequence for the overall execution of the program. For example, the execution of a correctly predicted conditional branch instruction, as well as that of all the instructions leading up to it, is inconsequential. We propose a clustered architecture that steers consequential instructions to the primary cluster and inconsequential ones to a secondary cluster, called the I-Pipe, that is less capable and therefore more area- and power-efficient. The proposed architecture also entails minimal inter-cluster communication, greatly reducing the complexity of the inter-cluster result buses. This steering policy also improves performance, because the consequential instructions face no interference from the inconsequential ones. We demonstrate a \(42\%\) area reduction compared to a baseline single-cluster (Tigerlake-based) architecture, an \(18.5\%\) power reduction on the SPEC CPU2017 suite (\(13.7\%\) on GAPBS), and a \(5.15\%\) performance uplift on the SPEC CPU2017 suite (\(10.22\%\) on GAPBS).