Abstract

In this paper, we explore the challenges in scaling on-chip networks towards kilo-core processors. Current low-radix topologies optimize for fast local communication, but do not scale well to kilo-core systems because of the large number of routers required, which increases both power and hop count. In contrast, symmetric high-radix topologies optimize for global communication with lower hop counts, but degrade local communication with their large, slow routers. To optimize local and global communication independently, we decouple the interconnect design using asymmetric high-radix topologies. By setting a design goal of matching router speed with wire speed, our proposed topologies use fast medium-radix routers to optimize for local communication and a few slow high-radix routers that reduce hop count to optimize for global communication. Our asymmetric high-radix designs are enabled by recently proposed SwizzleSwitches, which allow us to achieve performance scalability within realistic power budgets. We propose and evaluate two asymmetric high-radix topologies: Super-Star (asymmetric folded Clos) and Super-StarX (asymmetric folded Clos with superimposed mesh). Our evaluations show that the best-performing asymmetric high-radix topology improves average network latency over a mesh by 45% while reducing power consumption by 40%. Compared to symmetric high-radix topologies, network throughput is improved by 2.9× while still providing similar latency benefits and power efficiency.
