Cooperative edge caching enables edge servers to jointly utilize their caches to store popular contents, thus drastically reducing the latency of content acquisition. One fundamental problem of cooperative caching is how to coordinate the cache replacement decisions at edge servers to meet users' dynamic requirements and avoid caching redundant contents. Online deep reinforcement learning (DRL) is a promising way to solve this problem by learning a cooperative cache replacement policy through continuous interactions (trial and error) with the environment. However, the sampling process of these interactions is usually expensive and time-consuming, which hinders the practical deployment of online DRL-based methods. To bridge this gap, we propose a novel Delay-awarE Cooperative cache replacement method based on Offline deep Reinforcement learning (DECOR), which exploits existing data at the mobile edge to train an effective policy while avoiding expensive data sampling in the environment. A tailored convolutional neural network is also developed to improve training efficiency and cache performance. Experimental results show that DECOR learns a superior offline policy from a static dataset compared with an advanced online DRL-based method. Moreover, the learned offline policy outperforms the behavior policy used to collect the dataset by up to 35.9%.