Latency to end-users and regulatory requirements push cloud providers to operate many datacenters around the globe to host their cloud services. An emerging problem in such geo-distributed architectures, known as request allocation, is to assign each user request to an appropriate datacenter so that both cloud providers (e.g., low bandwidth cost) and end-users (e.g., low latency) benefit. However, prior request allocation solutions have significant limitations: they either optimize the benefits of only one entity (providers or users), or ignore practical yet indispensable factors (e.g., heterogeneous latency requirements of different users and diverse per-unit bandwidth costs among datacenters) when optimizing the benefits of both. In this paper, we study the problem of minimizing the total bandwidth cost for cloud service providers while guaranteeing the latency requirements of end-users. Specifically, we formulate an integer program that accounts for the diversity in both the delay of requests and the per-unit bandwidth cost of datacenters. To solve this problem efficiently and practically, we first relax the integer program into a continuous convex optimization problem and then use random sampling to turn the relaxed solution into a feasible solution of the original integer program. Rigorous theoretical analysis proves that our algorithm achieves a good competitive ratio. Extensive simulations demonstrate that, compared to conventional methods, our proposed algorithm reduces the total bandwidth cost by 30% while guaranteeing the latency requirements of all requests.
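To illustrate the random sampling step, the following is a minimal sketch and not the algorithm analyzed in this paper. It assumes that a relaxed solution is already available as a matrix of fractional allocations, one row per request, and it samples a single datacenter per request in proportion to those fractions, restricted to datacenters that meet the request's latency requirement. The names `round_allocation`, `total_cost`, `fractional`, `latency`, `max_latency`, `demand`, and `unit_cost` are hypothetical, and the sketch ignores any capacity constraints that a full formulation may include.

```python
import random


def round_allocation(fractional, latency, max_latency, rng):
    """Sample one datacenter per request from a fractional allocation.

    fractional[i][j]: fraction of request i sent to datacenter j
                      (assumed output of some relaxed problem).
    latency[i][j]:    latency of request i when served by datacenter j.
    max_latency[i]:   latency requirement of request i.
    rng:              a random.Random instance, for reproducibility.
    """
    assignment = []
    for i, weights in enumerate(fractional):
        # Only datacenters with positive weight that satisfy the latency
        # requirement are candidates, so the rounded solution stays feasible.
        feasible = [
            j for j, w in enumerate(weights)
            if w > 0 and latency[i][j] <= max_latency[i]
        ]
        if not feasible:
            raise ValueError(f"request {i} has no feasible datacenter")
        probs = [weights[j] for j in feasible]
        # random.choices normalizes the weights internally.
        assignment.append(rng.choices(feasible, weights=probs, k=1)[0])
    return assignment


def total_cost(assignment, demand, unit_cost):
    """Bandwidth cost: demand of each request times its datacenter's unit cost."""
    return sum(demand[i] * unit_cost[j] for i, j in enumerate(assignment))


if __name__ == "__main__":
    # Illustrative data only: 3 requests, 2 datacenters.
    fractional = [[0.5, 0.5], [1.0, 0.0], [0.2, 0.8]]
    latency = [[10, 20], [15, 90], [30, 25]]
    max_latency = [50, 40, 40]
    demand = [1.0, 2.0, 1.0]
    unit_cost = [3.0, 1.0]

    chosen = round_allocation(fractional, latency, max_latency, random.Random(0))
    print("assignment:", chosen)
    print("bandwidth cost:", total_cost(chosen, demand, unit_cost))
```

Because each request is sampled only from latency-feasible datacenters, every rounded assignment meets the latency requirements. For a cost that is linear in the allocation, the expected cost of the sampled assignment equals the cost of the fractional solution, provided the fractional solution places no weight on latency-infeasible datacenters.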