Abstract

Weighted round-robin (WRR) arbitration provides global fairness in networks-on-chip (NoCs), unlike the commonly used round-robin and priority-based arbitration techniques. However, the large number of weights greatly expands the design space and complicates performance (latency-throughput) tuning. Fast and accurate performance analysis techniques for NoCs are therefore crucial for accelerating design space exploration and for accurate pre-silicon evaluation. This article presents the first comprehensive performance analysis technique for NoCs with WRR arbitration and finite buffers. It handles bursty traffic and scales to large NoC sizes. The proposed technique first estimates the probability that a queue is full and uses this result to compute the modified service time and queuing delay. Thorough experimental evaluations with synthetic traffic and real applications show that the proposed analytical model is always within 10% of cycle-accurate simulations. Moreover, the proposed performance analysis technique is five orders of magnitude faster than cycle-accurate simulations for a 16 × 16 mesh NoC.
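For readers unfamiliar with WRR arbitration, the following is a minimal sketch of the general idea, not the article's model or router implementation. It assumes a simplified arbiter in which each input port may send up to its weight in flits per round and empty ports are skipped; the function name `wrr_schedule` and the sample queues are hypothetical.

```python
from collections import deque


def wrr_schedule(queues, weights):
    """Yield (port, flit) pairs in weighted round-robin order.

    Assumption for illustration: in each round, port i may send at most
    weights[i] flits before the arbiter moves on to the next port.
    """
    while any(queues):  # continue while at least one queue is non-empty
        for port, (queue, weight) in enumerate(zip(queues, weights)):
            for _ in range(weight):
                if not queue:
                    break  # nothing left at this port; move to the next one
                yield port, queue.popleft()


# Hypothetical example: port 0 has weight 2, port 1 has weight 1.
queues = [deque(["a0", "a1", "a2"]), deque(["b0", "b1", "b2"])]
order = [flit for _, flit in wrr_schedule(queues, [2, 1])]
print(order)
```

With these weights, port 0 receives twice the service share of port 1 while both have flits waiting. The sketch uses unbounded queues, so it does not capture the finite-buffer blocking that the proposed analysis accounts for.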
