Abstract

In a heterogeneous CPU-GPU multicore system that contains various types of computation units as well as on-chip storage units, the on-chip interconnection network is a critical shared resource responsible for carrying coherence and memory traffic. On-chip traffic originating from or destined for different components has different latency and throughput requirements, so a naive or unoptimized traffic prioritization mechanism usually results in suboptimal system performance. In this work, we quantify the performance and throughput requirements of both CPU and GPU applications, extract the critical information, and propose a network prioritization mechanism that effectively coordinates on-chip traffic to improve overall system performance.
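To make the prioritization idea concrete, below is a minimal sketch of one possible priority-based router arbitration policy: latency-sensitive CPU packets are granted first, while throughput-oriented GPU packets are bounded by a starvation threshold. This is an illustrative toy under assumed names and parameters (the packet classes, the `max_gpu_wait` bound, and the scheduling rule are all assumptions for the example), not the mechanism proposed in the paper.

```python
from __future__ import annotations
from collections import deque
from dataclasses import dataclass


@dataclass
class Packet:
    src: str          # "cpu" or "gpu": which type of core injected the packet
    inject_cycle: int  # cycle at which the packet entered the input queue


class PriorityArbiter:
    """Grants the output port to CPU (latency-sensitive) packets first,
    but bounds how long GPU (throughput-sensitive) packets can wait."""

    def __init__(self, max_gpu_wait: int = 32):
        self.cpu_q: deque[Packet] = deque()
        self.gpu_q: deque[Packet] = deque()
        self.max_gpu_wait = max_gpu_wait  # assumed starvation bound, in cycles

    def enqueue(self, pkt: Packet) -> None:
        (self.cpu_q if pkt.src == "cpu" else self.gpu_q).append(pkt)

    def grant(self, cycle: int) -> Packet | None:
        # Promote the oldest GPU packet once it exceeds the starvation bound.
        if self.gpu_q and cycle - self.gpu_q[0].inject_cycle >= self.max_gpu_wait:
            return self.gpu_q.popleft()
        # Otherwise favor latency-sensitive CPU traffic.
        if self.cpu_q:
            return self.cpu_q.popleft()
        if self.gpu_q:
            return self.gpu_q.popleft()
        return None


# Example: a GPU packet waits until the starvation bound forces its grant.
arb = PriorityArbiter(max_gpu_wait=4)
arb.enqueue(Packet("gpu", inject_cycle=0))
for cycle in range(1, 6):
    arb.enqueue(Packet("cpu", inject_cycle=cycle))
    granted = arb.grant(cycle)
    print(cycle, granted.src)   # grants cpu, cpu, cpu, gpu, cpu
```

The starvation bound illustrates the coordination problem the abstract describes: a strict CPU-first policy helps latency but can throttle GPU throughput, so any practical policy must balance the two classes of traffic.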
