Abstract

In the midst of progressively shrinking silicon technologies, enhancements in performance and area requirements come at the cost of some side effects. One such cost is the deviation of nominal parameters of process, voltage, and temperature. Considered a manufacturing defect, variations limit the maximum achievable performance. These affected areas possess lower reliability and consume more power than their theoretical counterpart.In this paper, we study and mitigate variations observed in GDDRx based GPU memories. GPU devices are exploited primarily for their massive parallelism and applications with both regular and irregular memory accesses see significant performance benefits. We also find that these consume a significant portion of the total GPU power. We propose a variation mitigating and power-saving technique for GPU memories. This accounts for a power savings of up to 44.61% (20.3% on average). Simultaneously, we also maintain the device data which prevents page faults and re-computation costs. This, however, leads to some performance overheads, which are limited to 4%. Mitigation of process variation combined with state preservation leads to more reliable GPU computations. Additionally, our model lowers access latencies by 15.7% in comparison to a variation affected baseline GPU, which ultimately helps to improve the throughput of the device.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call