Abstract

Modern data center applications need high throughput (40 Gbps) and ultra-low latency (<10 µs per hop), along with low CPU overhead. Remote Direct Memory Access (RDMA), which can be deployed over Ethernet using the RDMA over Converged Ethernet v2 (RoCEv2) protocol, has the potential to satisfy these requirements. RoCEv2 needs a lossless network to achieve high performance, and it relies on Priority-based Flow Control (PFC) to prevent packet loss caused by buffer overflow. However, packets can still be lost in today’s data centers for other reasons, such as switch configuration errors. Two retransmission algorithms handle packet loss recovery: Go-Back-0 and Go-Back-N. Unfortunately, when the Go-Back-N algorithm is simply applied to RoCEv2, the relative throughput drops to nearly zero once the packet loss rate exceeds 1%. This is mainly caused by an improper mechanism for triggering NAK generation. This paper proposes an Improved Go-Back-N algorithm, which involves two mechanisms, to solve this problem. Improved Go-Back-N is easy to deploy in today’s data centers because it requires no changes to switches. It raises the relative throughput to about 60% when the packet loss rate increases to 1%.
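
To make the difference between the two baseline recovery schemes concrete, the following is a minimal sketch of a simplified, deterministic model. It is not the paper's Improved Go-Back-N and not real RoCEv2 NIC behavior. It assumes the sender transmits one window of packets at a time, then learns the first lost sequence number and rewinds either to that packet (Go-Back-N) or to the start of the message (Go-Back-0). The function name `count_transmissions`, the `window` parameter, and the `lost` set are hypothetical names chosen for illustration.

```python
def count_transmissions(n_packets, lost, window, go_back_zero=False):
    """Count packets put on the wire to deliver n_packets in order.

    lost: set of (seq, attempt) pairs dropped by the network, where
          attempt is the 0-based count of earlier sends of that seq.
    window: packets sent before the sender learns of a loss (assumed model).
    go_back_zero: rewind to packet 0 on loss instead of the lost packet.
    """
    attempts = [0] * n_packets   # how many times each seq has been sent
    sent = 0                     # total packets transmitted
    next_seq = 0                 # first packet not yet delivered in order
    while next_seq < n_packets:
        first_loss = None
        end = min(next_seq + window, n_packets)
        for seq in range(next_seq, end):
            dropped = (seq, attempts[seq]) in lost
            attempts[seq] += 1
            sent += 1
            if dropped and first_loss is None:
                first_loss = seq  # later packets in this window are wasted
        if first_loss is None:
            next_seq = end        # whole window delivered in order
        elif go_back_zero:
            next_seq = 0          # Go-Back-0: restart the message
        else:
            next_seq = first_loss # Go-Back-N: resume at the lost packet
    return sent


if __name__ == "__main__":
    lost = {(2, 0)}  # the first transmission of packet 2 is dropped
    for name, gb0 in (("Go-Back-N", False), ("Go-Back-0", True)):
        sent = count_transmissions(8, lost, window=4, go_back_zero=gb0)
        print(f"{name}: {sent} packets sent, goodput ratio {8 / sent:.2f}")
```

In this toy model a single loss in an 8-packet message costs 2 extra transmissions under Go-Back-N and 4 under Go-Back-0. The wasted work grows with the window size and the number of losses, which is the general effect behind the throughput collapse described above.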
