Fair DMA Scheduler for Low-Latency Accelerator Offloading

Ikuo Otani, Kei Fujimoto, Akinori Shiraga

doi:10.1109/ispa-bdcloud-socialcom-sustaincom57177.2022.00011

Abstract

Accelerators are increasingly used on general-purpose servers to achieve higher computing performance than CPUs alone can deliver. Because accelerators are expensive and consume a lot of power, one accelerator should be shared by multiple applications to reduce power consumption. However, transfer contention can occur when multiple direct memory access (DMA) transfers to an accelerator are performed simultaneously. In this paper, we propose a DMA scheduler that mitigates the latency increase caused by DMA transfer contention. In the proposed method, all requests are divided into small data chunks of equal size, and DMA transfers are performed by selecting requests in a fair round-robin manner. This allows small, delay-sensitive requests to be transferred in between the chunks of preceding large requests, thus mitigating the latency increase caused by DMA transfer contention. Fair transfer not only makes the proposed method versatile but also prevents excessive transfer latency for large requests. Through performance evaluation, we confirmed that the proposed method improves the transfer time of small requests by 60.0%–82.5%. We also found that the overhead of performing DMA transfers in small units is negligible. These results show that the proposed method effectively reduces the impact of DMA transfer contention while adding only a small overhead to the overall accelerator offload system.
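The chunked round-robin idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `round_robin_chunks` and `completion_step`, the request sizes, and the 1024-byte chunk size are all assumptions chosen for illustration, and appending a tuple to a list stands in for an actual DMA transfer.

```python
from collections import deque


def round_robin_chunks(requests, chunk_size):
    """Return the order in which (request_id, bytes) chunks would be sent.

    requests: dict mapping a request id to its size in bytes (arrival order).
    chunk_size: hypothetical fixed chunk size in bytes (an assumption here).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    queue = deque(requests.items())  # pending (id, remaining bytes)
    order = []
    while queue:
        req_id, remaining = queue.popleft()
        sent = min(chunk_size, remaining)
        order.append((req_id, sent))  # stands in for one DMA transfer
        if remaining > sent:
            # Unfinished request goes to the back, so others get a turn.
            queue.append((req_id, remaining - sent))
    return order


def completion_step(order, req_id):
    """1-based index of the transfer that finishes the given request."""
    return max(i for i, (rid, _) in enumerate(order, start=1) if rid == req_id)


# A large request arrives just before a small one (illustrative sizes).
schedule = round_robin_chunks({"large": 4096, "small": 512}, chunk_size=1024)
unchunked = round_robin_chunks({"large": 4096, "small": 512}, chunk_size=4096)
```

In this toy example the small request completes on the second transfer, after only one 1024-byte chunk of the large request, whereas without chunking it would wait for all 4096 bytes of the large request first. The large request still finishes last in both cases, which reflects the fairness property the abstract describes.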

Full Text