New high-speed switched networks have reduced the latency of network page transfers significantly below that of local disk. This trend has led to the development of systems that use network-wide memory, or global memory, as a cache for virtual memory pages or file blocks. A crucial issue in the implementation of these global memory systems is the selection of the target nodes to receive replaced pages. Current systems use various forms of an approximate global LRU algorithm for making these selections. However, using age information alone can lead to suboptimal performance in two ways. First, workload characteristics can lead to uneven distributions of old pages across servers, causing increased contention delays. Second, the global memory traffic imposed on a node can degrade the performance of local jobs on that node. This paper studies the potential benefit and the potential harm of using load information, in addition to age information, in global memory replacement policies. Using an analytic queueing network model, we show the extent to which server load can degrade remote memory latency and how load balancing solves this problem. Load balancing requests can cause the system to deviate from the global LRU replacement policy, however. Using trace-driven simulation, we study the impact on application performance of deviating from the LRU replacement policy. We find that deviating from strict LRU, even significantly for some applications, does not affect application performance. Based upon these results, we conclude that global memory systems can gain substantial benefit from load balancing requests with little harm from suboptimal replacement decisions. Finally, we illustrate the use of the intuition gained from the model and simulation experiments by proposing a new family of algorithms that incorporate load considerations as well as age information in global memory replacement decisions.
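To make the idea concrete, the sketch below illustrates one way a replacement policy could combine age and load information when choosing a target node for an evicted page. It is not the paper's algorithm; the node state fields, the candidate-fraction parameter, and the least-loaded tie-break are illustrative assumptions, shown only to contrast a load-aware choice with strict global LRU.

```python
# Hypothetical per-node state: age of the oldest idle page a node holds and a
# simple load metric (e.g., pending remote-memory requests). These names and
# the policy below are illustrative assumptions, not the paper's algorithms.
class NodeInfo:
    def __init__(self, node_id, oldest_page_age, load):
        self.node_id = node_id
        self.oldest_page_age = oldest_page_age  # larger = older (better LRU victim)
        self.load = load                        # e.g., length of the node's request queue

def choose_target(nodes, candidate_fraction=0.5):
    """Pick a node to receive an evicted page.

    Strict global LRU would always pick the node holding the oldest page.
    This sketch instead keeps the top `candidate_fraction` of nodes by page
    age and, among those candidates, picks the least loaded one -- accepting
    a bounded deviation from LRU to avoid contention at busy servers.
    """
    by_age = sorted(nodes, key=lambda n: n.oldest_page_age, reverse=True)
    k = max(1, int(len(by_age) * candidate_fraction))
    candidates = by_age[:k]
    return min(candidates, key=lambda n: n.load)

# Example: node 2 holds the globally oldest page but is heavily loaded,
# so the policy diverts the replaced page to the lightly loaded node 0.
nodes = [
    NodeInfo(0, oldest_page_age=90, load=1),
    NodeInfo(1, oldest_page_age=40, load=0),
    NodeInfo(2, oldest_page_age=100, load=12),
    NodeInfo(3, oldest_page_age=30, load=2),
]
print(choose_target(nodes).node_id)  # -> 0
```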