Abstract

Memory controllers in graphics processing units (GPUs) often employ out-of-order scheduling to maximize row access locality. However, out-of-order scheduling requires more complex logic than in-order scheduling. To provide low-cost, low-complexity memory scheduling, we propose an alternative in which scheduling is performed not at the destination (i.e., the memory controller) but at the source (i.e., the cores). We propose two complementary techniques for source-based memory scheduling: network congestion-aware source throttling, and super packets, in which multiple request packets are grouped together to create a single super packet. By combining these techniques, performance across a wide range of applications is within 95% of the complex FR-FCFS (first-ready, first-come first-served) scheduler on average, at significantly lower cost and complexity.
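
The super-packet idea can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract does not specify the grouping criterion, so the code below assumes that requests are grouped by the DRAM row they target, models a single bank with one open row, and uses hypothetical names (`row_hits_in_order`, `group_into_super_packets`, `max_size`). It only shows why grouping at the source can let a simple in-order controller recover row-buffer hits.

```python
from collections import OrderedDict


def row_hits_in_order(rows):
    """Count row-buffer hits when a single bank serves requests strictly in order.

    `rows` is the sequence of DRAM row IDs in arrival order (an assumed,
    simplified model: one bank, one open row at a time).
    """
    hits = 0
    open_row = None
    for row in rows:
        if row == open_row:
            hits += 1
        open_row = row
    return hits


def group_into_super_packets(rows, max_size):
    """Group a source's pending requests by row into super packets.

    Assumption: requests targeting the same row are merged, rows keep their
    first-seen order, and a super packet holds at most `max_size` requests.
    """
    by_row = OrderedDict()
    for row in rows:
        by_row.setdefault(row, []).append(row)

    packets = []
    for requests in by_row.values():
        for start in range(0, len(requests), max_size):
            packets.append(requests[start:start + max_size])
    return packets


# Interleaved requests to rows 3 and 7 from one source.
pending = [3, 7, 3, 7, 3]
packets = group_into_super_packets(pending, max_size=4)
arrival_order = [row for packet in packets for row in packet]

print(row_hits_in_order(pending))        # 0 hits without grouping
print(packets)                           # [[3, 3, 3], [7, 7]]
print(row_hits_in_order(arrival_order))  # 3 hits with grouping
```

In this toy model the in-order controller sees no row hits for the interleaved stream but three hits once the source has grouped the requests, which is the effect an out-of-order scheduler would otherwise have to produce at the destination.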

