Abstract

The effect of switch design on the application performance of cache-coherent non-uniform memory access (CC-NUMA) multiprocessors is studied in detail. Wormhole routing and cut-through switching are evaluated for shared-memory multiprocessors that employ a multistage interconnection network (MIN) and a full-map directory-based cache coherence protocol. The switch design also considers virtual channels and a varying number of input buffers per switch. On this basis, four different switch architectures are presented and compared. The evaluation uses execution-driven simulation of five different applications to capture the random, bursty nature of network traffic arrivals. A round-robin memory management policy is implemented. The authors show that cut-through switching with buffers and virtual channels greatly reduces the average message latency. The waiting times of messages at the various switch stages are also presented. Finally, the authors show how stall times and execution times for these applications vary with switch delay and wire width.

