Abstract
Simulators help computer architects optimize system designs. The limited performance of simulators even of moderate size and detail makes the approach infeasible for design-space exploration of future exascale systems. Analytic models, in contrast, offer very fast turn-around times. In this paper we propose an analytic multi-core processor-performance model that takes as inputs a) a parametric microarchitecture-independent characterization of the target workload, and b) a hardware configuration of the core and the memory hierarchy. The processor-performance model considers instruction-level parallelism (ILP) per type, models single instruction, multiple data (SIMD) features, and considers cache and memory-bandwidth contention between cores. We validate our model by comparing its performance estimates with measurements from hardware performance counters on Intel Xeon and ARM Cortex-A15 systems. We estimate multi-core contention with a maximum error of 11.4 percent. The average single-thread error increases from 25 percent for a state-of-the-art simulator to 59 percent for our model, but the correlation is still 0.8, a high relative accuracy, while we achieve a speedup of several orders of magnitude. With a much higher capacity than simulators and more reliable insights than back-of-the-envelope calculations it makes automated design-space exploration of exascale systems possible, which we show using a real-world case study from radio astronomy.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.