Power management (PM) is cumbersome for today's computing systems: attainable performance is bounded by the architecture's computing efficiency and capped by temperature, current, and power constraints. PM is composed of multiple interacting layers. High-level controllers (HLCs) comprise application-level policies, operating system-directed configuration and power management (OSPM) agents, and PM governors and interfaces. The enforcement of high-level control decisions is currently delegated to an on-chip power management unit (PMU) executing tailored PM firmware routines. The complexity of this structure arises from the scale of the interaction, which pervades the whole system architecture. This paper characterizes the cost of the communication backbone between high-level OSPM agents and the on-chip PMU in high-performance computing (HPC) processors. For this purpose, we target the System Control and Management Interface (SCMI), an open standard proposed by Arm. We enhance a fully open-source, end-to-end FPGA-based hardware/software (HW/SW) framework to emulate the interaction between an HLC, an HPC system, and a PMU. The framework includes the application-level PM policies, the drivers of the OSPM governor, and the hardware and firmware of the PMU, allowing us to evaluate the impact of the communication backbone on the overall control scheme. With this framework, we first conduct an in-depth latency study of the communication interface across the whole PM HW and SW stack. We then study the impact of latency on the quality of the end-to-end control, showing that the SCMI protocol can sustain reactive power management policies.
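To make the measured interaction concrete, the following minimal sketch illustrates a single agent-to-platform SCMI command over the standard shared-memory transport, roughly as an OSPM driver would issue it to the PMU firmware. The memory layout, header encoding, and the performance-protocol/command IDs follow the public Arm SCMI specification; the channel pointer, the ring_doorbell() notification, and the polling discipline are hypothetical, platform-specific placeholders, not the paper's implementation.

```c
#include <stdint.h>

#define SCMI_PROTO_PERF     0x13u  /* SCMI performance domain protocol     */
#define SCMI_PERF_LEVEL_SET 0x07u  /* PERFORMANCE_LEVEL_SET command ID     */

/* Shared-memory mailbox layout for one agent-to-platform channel,
 * per the SCMI spec's shared-memory transport. */
struct scmi_shmem {
    uint32_t reserved0;
    uint32_t channel_status;   /* bit 0: channel free, bit 1: channel error */
    uint32_t reserved1[2];
    uint32_t flags;            /* bit 0: completion-interrupt enable        */
    uint32_t length;           /* message header + payload size, in bytes   */
    uint32_t msg_header;       /* token | protocol ID | type | message ID   */
    uint32_t payload[];        /* command-specific parameters               */
};

/* Header encoding per the SCMI spec: message ID in bits [7:0],
 * message type in [9:8] (0 for commands), protocol ID in [17:10],
 * token in [27:18]. */
static inline uint32_t scmi_hdr(uint32_t msg, uint32_t proto, uint32_t token)
{
    return (msg & 0xffu) | ((proto & 0xffu) << 10) | ((token & 0x3ffu) << 18);
}

/* Hypothetical example: ask the PMU to move performance domain
 * `domain` to operating point `level`, polling for completion. */
static void scmi_perf_level_set(volatile struct scmi_shmem *ch,
                                uint32_t domain, uint32_t level)
{
    while (!(ch->channel_status & 1u))
        ;                          /* wait until the channel is free        */
    ch->channel_status = 0;        /* claim the channel (mark it busy)      */
    ch->flags = 0;                 /* polling mode, no completion interrupt */
    ch->msg_header = scmi_hdr(SCMI_PERF_LEVEL_SET, SCMI_PROTO_PERF, 0);
    ch->payload[0] = domain;
    ch->payload[1] = level;
    ch->length = 3u * sizeof(uint32_t); /* header + two payload words       */
    /* ring_doorbell();  platform-specific notification to the PMU          */
    while (!(ch->channel_status & 1u))
        ;                          /* PMU firmware sets "free" on completion */
}
```

Every such exchange traverses the mailbox-and-doorbell path whose latency this paper characterizes, which is why the round-trip cost of the transport directly bounds how reactive the end-to-end PM control loop can be.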