With the rapid development of cloud applications, datacenter workloads increasingly mix long and short flows and exhibit frequent micro-bursts, which places new demands on network transmission performance: ultra-low latency, high throughput, and strong stability. To meet these demands, datacenters have begun to deploy low-diameter topologies, giving rise to the high-performance datacenter (HPDC). However, because of the complexity of the load, existing congestion control mechanisms cannot keep dynamic delay in the network under control, which significantly restricts the development of HPDCs. A congestion control mechanism designed for the HPDC is therefore needed. We propose EagerCC, an ultra-low-latency, low-overhead, and accurate congestion control mechanism based on In-Network Telemetry (INT) information for various datacenter scenarios, especially HPDCs. EagerCC uses switch-feedback, ACK-padding, and ACK-first to reduce feedback delay, and uses switch-calculation and probabilistic ACK-padding to reduce the overhead of congestion signals. Extensive experiments show that EagerCC performs well in various datacenter scenarios, especially HPDCs. Specifically, compared to HPCC in a Dragonfly topology, EagerCC reduces the 99th-percentile and average flow completion time (FCT) by 52.3% and 13.3%, respectively, for the HPC workload NPB-CG. In addition, EagerCC significantly reduces the network’s feedback delay and queue occupancy.