Abstract

Compared with the loss-based congestion control algorithms that dominate Transmission Control Protocol (TCP) implementations, the congestion-based algorithm BBR has shown much better performance on modern, resource-abundant communication links. However, when a large number of TCP sessions converge on a bottleneck switch, High-Performance Computing (HPC) clusters and data centers face resource constraints caused by the heavy workload of orchestrating and relocating workflows across the resource pool. This article discusses how to resolve this problem, commonly known as TCP incast, through efficient queue management at the bottleneck link and the addition of a shaper function to the standard BBR algorithm. We analyze the TCP incast issue for two efficient congestion control algorithms, BBR and CUBIC (named for the cubic window-growth function it uses in place of a linear one), in a highly overloaded convergent switch of the cluster. We observe that queuing delay and buffer build-up are two essential factors in causing TCP incast. Hence, we use the M/G/1/B queuing model to represent network traffic generated by multiple TCP sessions and analyze different buffer build-up scenarios at the bottleneck node of HPC clusters. Based on the findings of our queuing analysis, we propose an incast recovery BBR algorithm that introduces additional controls, such as incast shaping, to deal with queue build-up during TCP incast. We study the effects of these modifications to the BBR implementation in terms of performance parameters such as flow completion time, throughput, RTT variation, and fairness to competing flows, and find them to be significant compared with standard BBR and CUBIC implementations.
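
To make the queuing model concrete, the following is a minimal sketch of an M/G/1/B queue: Poisson arrivals, a general service-time distribution, one server, and room for at most B packets in the system. It is not the analysis used in the article. The function name `simulate_mg1b`, the uniform service-time distribution, and all parameter values are assumptions chosen only for illustration.

```python
import random
from collections import deque


def simulate_mg1b(arrival_rate, service_sampler, capacity, n_arrivals, seed=0):
    """Simulate a FIFO M/G/1/B queue (hypothetical helper, illustration only).

    arrival_rate    -- Poisson arrival rate (packets per unit time)
    service_sampler -- callable taking a random.Random and returning a service time
    capacity        -- B, the maximum number of packets in the system
                       (queued plus in service)
    """
    rng = random.Random(seed)
    departures = deque()  # departure times of packets still in the system
    now = 0.0
    dropped = 0
    total_sojourn = 0.0

    for _ in range(n_arrivals):
        # Exponential inter-arrival times give a Poisson arrival process.
        now += rng.expovariate(arrival_rate)

        # Remove packets that finished service before this arrival.
        while departures and departures[0] <= now:
            departures.popleft()

        # Buffer full: the arriving packet is lost.
        if len(departures) >= capacity:
            dropped += 1
            continue

        # Service starts when the previous packet leaves, or now if idle.
        start = departures[-1] if departures else now
        finish = start + service_sampler(rng)
        departures.append(finish)
        total_sojourn += finish - now

    accepted = n_arrivals - dropped
    return {
        "loss": dropped / n_arrivals,
        "mean_sojourn": total_sojourn / accepted if accepted else 0.0,
    }


def uniform_service(rng):
    # Assumed service-time distribution with mean 1.0 time unit.
    return rng.uniform(0.5, 1.5)


# Offered load 0.5 (light) versus 2.0 (an incast-like overload), B = 8.
light = simulate_mg1b(0.5, uniform_service, capacity=8, n_arrivals=20000)
incast = simulate_mg1b(2.0, uniform_service, capacity=8, n_arrivals=20000)
print("light load:", light)
print("overload:  ", incast)
```

Under these assumed parameters, the overloaded case should show both a much higher loss fraction and a longer mean time in the system than the lightly loaded case, which mirrors the buffer build-up and queuing delay that the abstract identifies as the drivers of TCP incast.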
