Abstract

Simulation is an integral tool in performance analysis; however, without some knowledge of a simulator's underlying accuracy and limitations, its results may prove wrong or misleading. Timing validation is one aspect of simulator development that is easy to overlook, typically because no comparison target was available when the simulator was written. This paper discusses the design and validation of an accurate timing model for an UltraSPARC IIICu-based system. An existing functional simulator was augmented with a cycle-accurate model of the memory hierarchy of a reference system. Key features of the model include a 'bridge' for the interface between the processor and the memory system, event windows between the simulated backplane and the processors, pipelined transactions, and an extension of the processor run loop to support them. Modeling the store buffer and prefetch mechanisms proved both challenging and important for the model's accuracy. Using a combination of documentation, microbenchmarks, and comparisons of the NAS Parallel Benchmarks between the simulator and a real machine, it was possible to uncover several undocumented architectural artifacts and to validate the simulator to a reasonable degree. Hardware performance counters and timing information were used to identify the sources of discrepancies. Surprisingly, the overhead of introducing the model was within a factor of two relative to the original functional simulator.

