A case study of trace-driven simulation for analyzing interconnection networks: cc-NUMAs with ILP processors

V. Puente, J.M. Prellezo, J.A. Gregorio, C. Izu, R. Beivide

doi:10.1109/empdp.2000.823409

Abstract

Evaluating network performance under real application loads requires detailed simulations that are intensive in both time and resources. Moreover, the use of ILP (instruction-level parallel) processors in cc-NUMA (cache-coherent non-uniform memory access) architectures introduces non-deterministic memory accesses; the resulting parallel system must be modeled by a detailed execution-driven simulation, which further increases the evaluation cost. This paper introduces a simulation methodology, based on network traces, to estimate the impact that a given network has on the execution time of parallel applications. The methodology allows the network design space to be studied with a level of accuracy close to that of execution-driven simulation but with much shorter simulation times. The network trace, extracted from an execution-driven simulation, is processed to replace the temporal dependencies produced by the simulated network with an estimate of the message dependencies caused by both the application and the cache-coherence protocol in use. The methodology has been tested on two direct networks, with 16 and 64 nodes respectively, running the FFT and Radix applications of the SPLASH-2 suite. The trace-driven simulation is 3 to 4 times faster than the execution-driven one, with an average error of 4% in the total execution time.
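The idea of replaying a trace by message dependencies rather than by recorded timestamps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the record layout (`TraceMessage`, `gap`, `deps`), the `replay` function, and the constant-latency network models are all hypothetical, and the sketch assumes the trace is already ordered so that every message appears after the messages it depends on.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TraceMessage:
    """One trace record (hypothetical layout, not the paper's format)."""
    msg_id: int
    src: int
    dst: int
    gap: int  # cycles of local work between the last dependency arriving and injection
    deps: List[int] = field(default_factory=list)  # ids of messages that must arrive first


def replay(trace: List[TraceMessage], latency: Callable[[int, int], int]) -> int:
    """Replay a dependency-annotated trace on a network latency model.

    Each message is injected `gap` cycles after all of its dependencies
    have been delivered, so injection times shift with the network under
    test instead of being fixed by the network that produced the trace.
    Returns the estimated completion time in cycles.
    """
    delivered: Dict[int, int] = {}
    for m in trace:  # assumes dependencies precede dependents in the list
        ready = max((delivered[d] for d in m.deps), default=0)
        inject = ready + m.gap
        delivered[m.msg_id] = inject + latency(m.src, m.dst)
    return max(delivered.values(), default=0)


# A toy trace: a request, its reply, an independent message, and a
# message that waits for both the reply and the independent message.
trace = [
    TraceMessage(0, src=0, dst=1, gap=10),
    TraceMessage(1, src=1, dst=0, gap=5, deps=[0]),
    TraceMessage(2, src=0, dst=2, gap=3),
    TraceMessage(3, src=0, dst=1, gap=2, deps=[1, 2]),
]

slow = replay(trace, lambda s, d: 20)  # hypothetical network: 20 cycles per message
fast = replay(trace, lambda s, d: 10)  # hypothetical network: 10 cycles per message
print(slow, fast)
```

Because only the `gap` values and the dependency lists are taken from the trace, the same trace can be replayed against different network models; a real network simulator would replace the constant-latency lambdas used here.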

Full Text