∗University of California, San Diego
†Research supported in part by NSF Grant No. DMS 98-01446
‡Research supported in part by Bell Communications Research, Morristown, New Jersey
§University of California, San Diego
¶AT&T Labs, Florham Park, New Jersey

The following problem arose in connection with studies of Internet web page caching. The general setting is as follows. In some fixed metric space $M$, $k$ “servers” $S_1, \ldots, S_k$ are given with arbitrary initial locations in $M$. Requests for service at points $\sigma_1, \sigma_2, \sigma_3, \ldots, \sigma_N$ in $M$ arrive over time. Immediately after request $\sigma_t$ is received, exactly one of the following two mutually exclusive actions must be taken:

(i) Some server is moved to $\sigma_t$, with a resulting cost of $c(\sigma_t)$, the “cost” of the point $\sigma_t$.

(ii) No server moves. In this case, the cost for “no service” is defined to be $\min_{1 \le i \le k} d(S_i, \sigma_t)$, where $d(x, y)$ denotes the distance between $x$ and $y$ in $M$.

A further feature of our model is that two parameters $u, w \ge 0$ are specified, which are used as follows. Before having to decide how to service request $\sigma_t$, the servers have at their disposal the knowledge of the $u + w$ requests $\sigma_i$ with $t - u \le i \le t + w - 1$. Thus, the servers can only “remember” or store the past $u$ requests $\sigma_{t-u}, \sigma_{t-u+1}, \ldots, \sigma_{t-1}$, but are allowed to know the $w$ future requests $\sigma_t, \sigma_{t+1}, \ldots, \sigma_{t+w-1}$ before having to service $\sigma_t$.

The rules which govern the choices made for servicing all the $\sigma_t$ define some algorithm $A$. In this model, $A$ is deterministic and can depend only on the values of the $\sigma_i$ which it currently knows, and nothing else. In particular, $A$ is not allowed to make probabilistic choices based on some source of randomness. We denote by $A(\sigma)$ the cost incurred by $A$ in servicing the request sequence $\sigma = (\sigma_1, \ldots, \sigma_N)$.

Of course, if we are allowed to know all the $\sigma_t$ before having to act, the cost of servicing $\sigma$ can very likely be decreased. Let us denote by $\mathrm{OFF}(\sigma)$ the minimum possible cost of