Abstract
Modern data centers aim to offer very high throughput and ultra-low latency to meet the demands of applications such as online intensive services. Traditional TCP/IP stacks cannot meet these requirements because of their high CPU overhead and high latency. Remote Direct Memory Access (RDMA) is an approach that can meet this demand. The mainstream transport protocol for RDMA over Ethernet is RoCE (RDMA over Converged Ethernet), which relies on Priority Flow Control (PFC) to make the network lossless. However, PFC is a coarse-grained protocol that can lead to problems such as congestion spreading and head-of-line blocking. A congestion control protocol that alleviates these problems is therefore needed. We propose a protocol called P4QCN for this purpose. P4QCN is a flow-level, rate-based congestion control scheme for RoCE that improves on the Quantized Congestion Notification (QCN) design and is built on P4. P4QCN extends the QCN protocol to make it compatible with IP-routed networks within a P4 framework, and it adopts a two-point algorithm architecture that is more effective than the three-point architecture used in QCN and Data Center QCN (DCQCN). Experiments show that the proposed P4QCN algorithm achieves the expected performance in terms of latency and throughput.
Highlights
Application and storage architectures within data centers have become more complicated in recent years
We introduce the Congestion Point (CP)-Reaction Point (RP) algorithm architecture and the implementation model of P4QCN, based on a programmable data plane and the Protocol Independent Switch Architecture (PISA)
Research on congestion control for high-performance data center networks has been in progress for many years, and some of the resulting schemes are widely used in data center networks
Summary
Application and storage architectures within data centers have become more complicated in recent years. Several new congestion control schemes have been proposed to improve performance in data centers, such as QCN [11] and DCQCN [12]. DCQCN uses a three-point algorithm architecture in which congestion feedback travels from the receiver back to the sender. As a result, the ECN (Explicit Congestion Notification) control loop may suffer a longer round-trip time as the data center grows, with more switches and more layers in the network. We design a congestion control protocol, called P4QCN, which extends QCN using P4 to alleviate the problems of PFC within a lossless network. With the aid of the features of the programmable data plane, P4QCN can provide flow-level, rate-based, network-assisted congestion control.
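To make the two-point idea concrete, the following is a minimal Python sketch of a QCN-style control loop in which the Congestion Point (the switch) computes a quantized feedback value from its queue and the Reaction Point (the sender) cuts its rate in response, with no receiver in the loop. It follows the general form of QCN feedback rather than the P4QCN implementation, which this summary does not describe. The class names and the constants `Q_EQ`, `W`, `GD`, and `FB_MAX` are illustrative assumptions, and rate recovery at the sender is omitted.

```python
from dataclasses import dataclass

# Illustrative constants (assumptions, not values taken from the paper).
Q_EQ = 20        # target (equilibrium) queue length, in packets
W = 2.0          # weight on the queue growth term
GD = 1 / 128     # multiplicative-decrease gain at the sender
FB_MAX = 63      # largest feedback magnitude (6-bit quantization)


@dataclass
class CongestionPoint:
    """Switch-side logic: turns queue state into a congestion feedback value."""
    q_old: int = 0  # queue length seen at the previous sample

    def feedback(self, q_len: int) -> int:
        q_off = q_len - Q_EQ          # how far the queue is above its target
        q_delta = q_len - self.q_old  # how fast the queue is growing
        self.q_old = q_len
        fb = -(q_off + W * q_delta)
        if fb >= 0:
            return 0                  # no congestion, so no notification
        return int(min(FB_MAX, -fb))  # quantized magnitude of the feedback


@dataclass
class ReactionPoint:
    """Sender-side logic: lowers the sending rate when feedback arrives."""
    rate_gbps: float

    def on_feedback(self, fb: int) -> None:
        if fb > 0:
            # Multiplicative decrease proportional to the feedback magnitude.
            self.rate_gbps *= 1 - GD * fb


if __name__ == "__main__":
    cp = CongestionPoint(q_old=Q_EQ)
    rp = ReactionPoint(rate_gbps=40.0)
    for q_len in (20, 30, 45):
        fb = cp.feedback(q_len)
        rp.on_feedback(fb)
        print(f"queue={q_len} feedback={fb} rate={rp.rate_gbps:.3f} Gbps")
```

Because the switch sends the notification directly to the sender, the length of the control loop depends on the distance between the sender and the congested switch rather than on the full path to the receiver and back.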