In modern multi-processor systems-on-chip (MPSoCs), requests from processor cores and accelerators, together with the responses returning from the lower-level memory, contend for shared cache bandwidth, making it a critical performance bottleneck. Prior research on shared cache management has considered requests from cores but has ignored the crucial contribution of their responses to this contention. Prior cache bypass techniques have focused on data reuse and neglected the system-level implications of shared cache contention. We propose COBRRA, a novel shared cache controller policy that mitigates this contention by aggressively bypassing selected responses from the lower-level memory and efficiently scheduling the remaining requests and responses to the cache. COBRRA improves the average performance of a set of 15 SPEC workloads by 49% and 33% compared to the no-bypass baseline and the best-performing state-of-the-art bypass solution, respectively. Furthermore, COBRRA reduces overall cache energy consumption by 38% and 31% compared to the no-bypass baseline and the most energy-efficient state-of-the-art bypass solution, respectively.