Abstract

We propose and analyse an on-chip interconnect design for improving the efficiency of multicore processors. Conventional interconnection networks are usually based on a single homogeneous network with uniform processing of all traffic. While the design is simplified, this approach can have performance bottlenecks and limitations on system efficiency. We investigate the traffic pattern of several real world applications. Based on a directory cache coherence protocol, we characterise and categorize the traffic in terms of various aspects. It is discovered that control and unicast packets dominated the network, while the percentages of data and multicast messages are relatively low. Furthermore, we find most of the invalidation messages are multicast messages, and most of the multicast messages are invalidation message. The multicast invalidation messages usually have higher number of destination nodes compared with other multicast messages. These observations lead to the proposed triple class interconnect, where a dedicated multicast-capable network is responsible for the control messages and the data messages are handled by another network. By using a detailed full system simulation environment, the proposed design is compared with the homogeneous baseline network, as well as two other network designs. Experimental results show that the average network latency and energy delay product of the proposed design have improved 24.4% and 10.2% compared with the baseline network.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call