The impact of instruction-level parallelism on multiprocessor performance and simulation methodology

V.S. Pai, P. Ranganathan, S.V. Adve

doi:10.1109/hpca.1997.569611

Publication Date: Feb 1, 1997

Citations: 64

Affiliation: Rice University

Abstract

Current microprocessors exploit high levels of instruction-level parallelism (ILP) through techniques such as multiple issue, dynamic scheduling, and non-blocking reads. This paper presents the first detailed analysis of the impact of such processors on shared-memory multiprocessors using a detailed execution-driven simulator. Using this analysis, we also examine the validity of common direct-execution simulation techniques that employ previous-generation processor models to approximate ILP-based multiprocessors. We find that ILP techniques substantially reduce CPU time in multiprocessors, but are less effective in reducing memory stall time. Consequently, despite the presence of inherent latency-tolerating techniques in ILP processors, memory stall time becomes a larger component of execution time and parallel efficiencies are generally poorer in ILP-based multiprocessors than in previous-generation multiprocessors. Examining the validity of direct-execution simulators with previous-generation processor models, we find that, with appropriate approximations, such simulators can reasonably characterize the behavior of applications with poor overlap of read misses. However, they can be highly inaccurate for applications with high overlap of read misses. For our applications, the errors in execution time with these simulators range from 26% to 192% for the most commonly used model, and from -8% to 73% for the most accurate model.
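
The arithmetic behind two of the abstract's claims can be sketched as follows. This is a minimal illustration, not the authors' methodology: the time values and speedup factors are hypothetical, and the signed-error convention (simulated minus reference, divided by reference) is an assumption about how the reported percentages are defined.

```python
def stall_fraction(cpu_time, stall_time):
    """Fraction of total execution time spent in memory stalls."""
    return stall_time / (cpu_time + stall_time)


def relative_error(simulated, reference):
    """Signed percentage error of a simulated time against a reference time.

    Assumed convention: a positive value means the simulator overestimates.
    """
    return 100.0 * (simulated - reference) / reference


# Hypothetical previous-generation breakdown (arbitrary time units).
base_cpu, base_stall = 60.0, 40.0

# Hypothetical ILP speedups: CPU time shrinks much more than stall time.
cpu_speedup, stall_speedup = 3.0, 1.25

ilp_cpu = base_cpu / cpu_speedup
ilp_stall = base_stall / stall_speedup

base_frac = stall_fraction(base_cpu, base_stall)
ilp_frac = stall_fraction(ilp_cpu, ilp_stall)

print(f"stall share, previous-generation: {base_frac:.2f}")
print(f"stall share, ILP:                 {ilp_frac:.2f}")

# A simulator predicting 126 units when the detailed model gives 100
# would be off by 26% under the assumed convention.
print(f"error: {relative_error(126.0, 100.0):.0f}%")
```

Because both components shrink but CPU time shrinks faster, the memory stall share of execution time rises even though absolute stall time falls, which is the effect the abstract describes.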
