Testing and measuring the performance and scalability of cloud-based software services is essential for the further development and improvement of cloud computing. The performance criteria for cloud-based software services are intertwined with scalability, flexibility, and efficiency. Especially when the volume of service delivery is growing rapidly, performance evaluation and testing of cloud-based software services are crucial for delivering these services at a quality that complies with the Service Level Agreement (SLA). In this work, we use the token bucket method, a scalable rate-limiting algorithm, to maximize the efficiency of cloud-based service performance. Achieving high availability and scalability in distributed systems has long been a major challenge. To retain customer satisfaction, trust, and revenue, cloud computing businesses need to make their services easy to use. The Spring Cloud-based software service uses Zuul as its gateway, so rate limiting must be performed there to guarantee service stability in the face of rapid growth. However, token bucket rate limiting alone cannot guarantee the availability of essential services. This research developed a method that protects against token overuse by using a URI configuration file in tandem with the Zuul gateway to filter requests before they receive tokens. The token bucket rate-limiting technique was implemented to perform the traffic-limiting function and guarantee the availability of the cloud platform's services. Measuring both scalability and availability helps illustrate the quality of service provided to customers. The scalability of cloud-based software services is evaluated using elasticity measurements. Ongoing research and development in the field of cloud computing may lead to growth in demand for cloud-based software services.
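The two-stage idea described above (a URI filter followed by token bucket rate limiting) can be illustrated with a minimal sketch. It is written in Python for brevity and is not the Zuul filter API or the implementation evaluated in this work. The names `TokenBucket`, `Gateway`, and `blocked_prefixes`, the HTTP status codes, and the assumption that the URI configuration lists prefixes to reject before a token is taken are all hypothetical choices made for illustration.

```python
import time


class TokenBucket:
    """Classic token bucket: refills at `rate` tokens per second, up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock              # injectable clock, useful for testing
        self.tokens = float(capacity)   # the bucket starts full
        self.last = clock()

    def try_acquire(self):
        now = self.clock()
        # Refill in proportion to the elapsed time, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class Gateway:
    """Hypothetical stand-in for a gateway filter chain (assumption, not Zuul)."""

    def __init__(self, bucket, blocked_prefixes):
        self.bucket = bucket
        # Assumed to be loaded from a URI configuration file in a real deployment.
        self.blocked_prefixes = tuple(blocked_prefixes)

    def handle(self, uri):
        # Stage 1: the URI filter runs first, so rejected requests
        # never consume a token.
        if uri.startswith(self.blocked_prefixes):
            return 403
        # Stage 2: the remaining requests compete for tokens.
        return 200 if self.bucket.try_acquire() else 429
```

Under these assumptions, a request whose URI matches a configured prefix is rejected without touching the bucket, so filtered traffic cannot exhaust the tokens that essential services depend on. Requests that pass the filter are admitted while tokens remain and are throttled otherwise.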