Abstract
This paper revisits peer-to-peer DMA (P2P DMA) and investigates its potential for use with Ethernet NICs and NVMe SSDs. As CPU performance improvements have slowed, peripheral accelerators such as Smart NICs and TPUs have emerged. P2P DMA offers a way to integrate multiple peripherals efficiently by avoiding bouncing data through main memory. However, P2P DMA has been studied mainly in the context of GPUs, and its benefits have been measured only for specific applications. In this paper, we perform experiments to clarify the benefits of using P2P DMA on individual devices, namely an Ethernet NIC and an NVMe SSD, from an I/O throughput perspective. We developed a library, called Libpop, that manipulates device memory in order to invoke P2P DMA. We integrated Libpop into pcie-bench, an FPGA-based benchmark device; netmap, for Ethernet NICs; and UNVMe, for NVMe SSDs. Experiments with these implementations show that (1) memory writes degrade DMA write throughput by 70%, (2) this degradation affects I/O throughput on the devices, and (3) P2P DMA can avoid the degradation, although device queues affect throughput on the Ethernet NIC.