Abstract

The growing number of cores challenges the scalability of chip multiprocessors. Recent studies have proposed disintegration: partitioning a large chip into multiple smaller chips and connecting them through silicon interposer-based (2.5D) integration. This approach can improve yield, but as the number of small chips increases, chip-to-chip communication becomes a performance bottleneck. This paper proposes a new network topology, ClusCross, that improves the performance of multicore interconnection networks on silicon interposer-based systems. The key idea is to treat each small chip as a cluster and to add cross-cluster long links that increase bisection width and decrease average hop count without increasing the number of router ports. Synthetic traffic patterns and real applications are simulated on a cycle-accurate simulator, and the results show lower network latency and higher saturation throughput than previously proposed topologies. Two versions of ClusCross are evaluated. One reduces average latency for coherence traffic by 10% compared to the state-of-the-art network-on-interposer topology, the misaligned ButterDonut. The other reduces power consumption by 7% and 10% compared to the FoldedTorus and ButterDonut topologies, respectively.
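
To make the hop-count argument concrete, the following minimal sketch compares the average hop count of a plain 4x4 mesh with the same mesh after two long links are added. The grid size and the link placement (`long_links`) are hypothetical choices made only for illustration; they are not the ClusCross wiring, which the abstract does not specify. The long links attach to corner routers, which use only two ports in a mesh, so the maximum port count per router stays unchanged in this toy case.

```python
from collections import deque
from itertools import product


def mesh_links(rows, cols):
    """Return the set of bidirectional links of a rows x cols mesh."""
    links = set()
    for r, c in product(range(rows), range(cols)):
        if r + 1 < rows:
            links.add(((r, c), (r + 1, c)))
        if c + 1 < cols:
            links.add(((r, c), (r, c + 1)))
    return links


def adjacency(nodes, links):
    """Build an adjacency list from an undirected link set."""
    adj = {n: [] for n in nodes}
    for a, b in links:
        adj[a].append(b)
        adj[b].append(a)
    return adj


def average_hop_count(nodes, links):
    """Mean shortest-path length over all ordered pairs of distinct nodes."""
    adj = adjacency(nodes, links)
    total, pairs = 0, 0
    for src in nodes:
        # Breadth-first search gives minimal hop counts from src.
        dist = {src: 0}
        queue = deque([src])
        while queue:
            cur = queue.popleft()
            for nxt in adj[cur]:
                if nxt not in dist:
                    dist[nxt] = dist[cur] + 1
                    queue.append(nxt)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


def max_ports(nodes, links):
    """Largest number of network links attached to any single router."""
    return max(len(neighbors) for neighbors in adjacency(nodes, links).values())


ROWS, COLS = 4, 4  # assumed size, for illustration only
nodes = list(product(range(ROWS), range(COLS)))
mesh = mesh_links(ROWS, COLS)

# Hypothetical long links between opposite corners (NOT the ClusCross layout).
long_links = {((0, 0), (3, 3)), ((0, 3), (3, 0))}

avg_mesh = average_hop_count(nodes, mesh)
avg_long = average_hop_count(nodes, mesh | long_links)
ports_mesh = max_ports(nodes, mesh)
ports_long = max_ports(nodes, mesh | long_links)

print(f"mesh:            avg hops = {avg_mesh:.3f}, max ports = {ports_mesh}")
print(f"mesh+long links: avg hops = {avg_long:.3f}, max ports = {ports_long}")
```

Adding a link can never lengthen a shortest path, so the average hop count can only stay the same or fall; the design question a topology such as ClusCross addresses is where to place a limited number of long links so that the reduction is largest within a fixed port budget.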
