Public cloud users are educated to practice horizontal scaling at the application level, with the assumption that more processing capacity can be achieved by adding nodes into the server fleet. In reality, however, applications—even those specifically designed to be horizontally scalable—often face unpredictable scalability issues when running at scale. In this article, we study the limit of horizontal scaling in public clouds by identifying sources of such limitations and quantitatively measuring their impact on processing capacity. To this end, we develop ScaleBench as a distributed and parallel cloud-scale testing framework and propose a capacity degradation index (CDI) to describe the level of capacity degradation observed in our benchmark studies. We have conducted extensive experiments in four real public clouds to identify possible bottlenecks in compute, block storage, networking, and object storage. Further, we carry out large-scale experiments with a real-life video transcoding application on worker fleets with up to 3200 vCPU cores. Our experimental results provide the quantitative evidence on the limit of horizontal scaling in public clouds. This helps cloud users make better design decisions on horizontally scalable applications.