A content delivery network (CDN) relies heavily on caches to push content close to end users. In both the traditional Internet architecture and emerging cloud-based frameworks, cache allocation is the core problem that any CDN operator needs to address. As the first step in cache deployment, CDN operators need to discover or estimate the distribution of user requests across geographic areas. This step yields a statistical spatial model of user requests, which serves as the key input for solving the optimal cache deployment problem. More often than not, the temporal information in user requests is omitted to simplify the CDN design. In this paper, we show that the spatial request model alone may not lead to a truly optimal cache deployment, and we revisit the problem by taking dynamic traffic demands into consideration. Specifically, we model the time-varying traffic demands and formulate the distributed cache deployment optimization problem as an integer linear program (ILP). To solve the problem efficiently, we transform the ILP into a scalable form and propose a greedy algorithm to tackle it. In experiments over a network of North American ISP points of presence (PoPs), our new solution outperforms the traditional CDN design method and reduces the overall delivery cost by 16% to 20%. We also study the impact of various traffic demand patterns on the CDN design cost, via experiments with both real-world traffic demand patterns and extensive synthetic trace data.
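To illustrate the kind of problem described above, the following is a minimal sketch of a greedy cache placement over time-varying demand. It is not the formulation or algorithm from the paper, which the abstract does not spell out. It assumes a simple facility-location-style cost: a fixed deployment cost per cache site plus, for every time slot, each region's traffic volume multiplied by the distance to its nearest deployed cache. The names `total_cost`, `greedy_deploy`, `demand`, `dist`, and `deploy_cost` are all hypothetical.

```python
def total_cost(open_sites, demand, dist, deploy_cost):
    """Deployment cost plus delivery cost summed over all time slots.

    demand[t][r]  : traffic volume of region r in time slot t (assumed input)
    dist[r][s]    : per-unit delivery cost from site s to region r (assumed)
    deploy_cost[s]: fixed cost of deploying a cache at site s (assumed)
    """
    if not open_sites:
        return float("inf")  # no cache deployed, demand cannot be served
    delivery = sum(
        volume * min(dist[region][s] for s in open_sites)
        for slot in demand
        for region, volume in enumerate(slot)
    )
    return delivery + sum(deploy_cost[s] for s in open_sites)


def greedy_deploy(demand, dist, deploy_cost):
    """Add the site that lowers total cost the most until no site helps."""
    open_sites = set()
    best = float("inf")
    all_sites = set(range(len(deploy_cost)))
    while all_sites - open_sites:
        site, cost = min(
            ((s, total_cost(open_sites | {s}, demand, dist, deploy_cost))
             for s in all_sites - open_sites),
            key=lambda pair: pair[1],
        )
        if cost >= best:
            break  # adding any further cache no longer pays off
        open_sites.add(site)
        best = cost
    return open_sites, best


# Toy instance: three regions on a line, three candidate sites.
dist = [[0, 4, 8], [4, 0, 4], [8, 4, 0]]
deploy_cost = [10, 10, 30]
# Two time slots: traffic shifts from region 0 to region 2.
demand = [[5, 1, 0], [0, 1, 5]]

sites, cost = greedy_deploy(demand, dist, deploy_cost)
print(sites, cost)  # {0, 1} 40
```

In this toy instance the greedy pass first opens site 1, which serves both time slots moderately well, then adds site 0, and stops because the expensive site 2 would raise the total cost. Because the cost is summed per time slot, a shift in where traffic originates changes which sites are worth opening, which is the effect the abstract attributes to time-varying demand.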