Abstract

One of the major impediments to pre-silicon performance analysis is the ever-increasing size of real workloads, which makes trace-based simulation impractical in time-bound processor development projects. In this paper, we describe a simple method of speeding up trace-driven architectural simulation tools through parallel processing. The PARSIM facility allows a processor performance team to accelerate its existing trace-driven simulation methodology without modifying the original trace generation and simulation tools. In achieving speed-up, it is important to ensure that there is no significant loss of accuracy compared to runs made on a uniprocessor workstation. PARSIM allows the user to retain accuracy by automatically adding a cache state warm-up preamble to each parallel trace chunk. It also offers built-in options for choosing samples from each parallel trace chunk. PARSIM is currently implemented to work on an IBM SP-2 system. We report experimental results for selected benchmark workloads to demonstrate the practical use of this facility.
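
The chunk-plus-preamble idea described above can be illustrated with a minimal sketch. This is not PARSIM's implementation: the function names, the toy direct-mapped cache, and the use of a thread pool in place of SP-2 nodes are all assumptions made for illustration. Each chunk is simulated starting from a cold cache, its preamble replays the trace records that immediately precede it, and misses are counted only inside the chunk proper.

```python
from concurrent.futures import ThreadPoolExecutor


def make_chunks(trace_len, num_chunks, warmup_len):
    """Split [0, trace_len) into contiguous chunks.

    Returns (warm_start, start, end) triples; records in
    [warm_start, start) form the hypothetical warm-up preamble.
    """
    base, extra = divmod(trace_len, num_chunks)
    chunks, start = [], 0
    for i in range(num_chunks):
        end = start + base + (1 if i < extra else 0)
        warm_start = max(0, start - warmup_len)
        chunks.append((warm_start, start, end))
        start = end
    return chunks


def simulate_chunk(trace, warm_start, start, end, num_sets=64, line_bytes=32):
    """Toy direct-mapped cache model (an assumption, not the paper's simulator).

    The cache starts cold; preamble accesses update cache state
    but only accesses in [start, end) contribute to the miss count.
    """
    tags = [None] * num_sets
    misses = 0
    for i in range(warm_start, end):
        line = trace[i] // line_bytes
        idx, tag = line % num_sets, line // num_sets
        if tags[idx] != tag:
            tags[idx] = tag
            if i >= start:
                misses += 1
    return misses


def parallel_misses(trace, num_chunks, warmup_len):
    """Simulate every chunk independently and sum the measured misses.

    A thread pool stands in for the parallel machine here; it shows the
    structure only and gives no real speed-up for pure-Python work.
    """
    chunks = make_chunks(len(trace), num_chunks, warmup_len)
    with ThreadPoolExecutor() as pool:
        return sum(pool.map(lambda c: simulate_chunk(trace, *c), chunks))


if __name__ == "__main__":
    # Synthetic address trace cycling over 16 cache lines.
    trace = [(i % 16) * 32 for i in range(1600)]
    print("serial:", simulate_chunk(trace, 0, 0, len(trace)))
    print("4 chunks, no warm-up:", parallel_misses(trace, 4, 0))
    print("4 chunks, 16-record warm-up:", parallel_misses(trace, 4, 16))
```

On this synthetic trace, the run without preambles overcounts misses because every chunk pays the cold-start cost again, while a preamble long enough to cover the working set brings the parallel total back in line with the serial run. Real workloads would need a much longer preamble, and the appropriate length is a tuning question this sketch does not address.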
