This paper presents an algorithm, which we refer to as SGTNE, that efficiently obtains lookahead information from a cluster of processors in a parallel simulation in order to unblock logical processes (LPs) on a given processor. SGTNE is based on TNE, a conservative synchronization scheme that relies on the independent execution of a shortest path algorithm on individual processors to provide lookahead to the resident LPs. Because TNE is executed on individual processors, it is susceptible to inter-processor deadlocks, which must be detected and broken at some cost. SGTNE (Semi-Global TNE) avoids these deadlocks by executing a shortest path algorithm over a snapshot of the LPs in a cluster of processors. An experimental study of SGTNE was conducted on an Intel Paragon A4. The study compared SGTNE to TNE and to an optimized version of the Chandy–Misra (CM) null message algorithm. We also investigated several scheduling algorithms for SGTNE and determined factors influencing its performance, most notably the influence of partitioning. Our results indicate that SGTNE provides good speedup relative to the fastest sequential algorithm and that it outperforms TNE. For the population level examined, SGTNE was 3–5 times as fast as the CM algorithm.