Large cloud providers, including AWS, Azure, and Google Cloud, offer two tiers of network services to their customers: one uses the provider's private wide area network (WAN-transit) to carry a customer's traffic as far as possible, and the other uses the public internet (inet-transit). Little is known about how each cloud provider configures its network to offer these transit services, how well the services work, and whether their quality can be further improved. In this work, we conduct a large-scale study to answer these questions. Using RIPE Atlas probes as vantage points, we explore how traffic enters and leaves each cloud's WAN. In addition, we measure the access latency of each cloud's WAN-transit and inet-transit services and compare them with that of an emulated performance-based routing strategy. Our study shows that despite the cloud providers' intention to carry customers' traffic on their WANs to the maximum extent possible, for about 12% (Azure) and 13% (Google) of our vantage points, traffic exits the cloud WAN early, at cloud edges more than 5000 km away from the vantage points' nearest cloud edges. In contrast, more than 84% (AWS), 78% (Azure), and 81% (Google) of vantage points enter a cloud WAN within a 500 km radius of their respective locations. Moreover, we find that cloud providers employ different routing strategies to implement the inet-transit service, leading to transit policies that may deviate from their advertised service descriptions. Finally, we find that a performance-based routing strategy can significantly reduce latencies in all three cloud providers for 4% to 85% of vantage point and cloud region pairs.