Mobile edge caching/computing (MEC) has emerged as a promising approach for addressing the drastically increasing mobile data traffic by bringing high caching and computing capabilities to the edge of networks. Under the MEC architecture, content providers (CPs) are allowed to lease virtual machines (VMs) at MEC servers to proactively cache popular contents and thereby improve users’ quality of experience. This scalable cache resource model raises the challenge of determining the ideal number of leased VMs for CPs to achieve the minimum expected downloading delay for users at the lowest caching cost. To address this challenge, in this paper, we propose an actor-critic (AC) reinforcement learning based proactive caching policy for mobile edge networks that requires no prior knowledge of users’ content demand. Specifically, we formulate the proactive caching problem under dynamic user content demand as a Markov decision process and propose an AC-based caching algorithm to minimize the caching cost and the expected downloading delay. In particular, to reduce the computational complexity, a branching neural network is employed to approximate the policy function in the actor part. Numerical results show that the proposed caching algorithm significantly reduces the total cost and the average downloading delay compared with other popular algorithms.
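For illustration only, the sketch below shows one way a branching actor network of the kind the abstract describes might be structured: a shared trunk encodes the demand state, one lightweight branch per content item emits a Bernoulli cache/no-cache probability (so the action space grows linearly rather than exponentially with the content library), and a critic head scores the state. This is a minimal PyTorch sketch under assumed dimensions and naming; the paper does not publish its architecture, so none of the identifiers below are the authors’.

```python
import torch
import torch.nn as nn

class BranchingActorCritic(nn.Module):
    """Illustrative branching actor-critic (not the paper's code).

    A shared trunk encodes the observed demand state; each branch
    outputs the probability of caching one content item, so per-content
    decisions are factored instead of enumerating all cache subsets.
    """

    def __init__(self, state_dim: int, num_contents: int, hidden: int = 128):
        super().__init__()
        # Shared feature extractor over the demand/state vector.
        self.trunk = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # One small branch per content: cache-or-not logit.
        self.branches = nn.ModuleList(
            [nn.Linear(hidden, 1) for _ in range(num_contents)]
        )
        # Critic head: scalar value estimate of the current state.
        self.value_head = nn.Linear(hidden, 1)

    def forward(self, state: torch.Tensor):
        h = self.trunk(state)
        # Per-content caching probabilities, shape (batch, num_contents).
        probs = torch.sigmoid(
            torch.cat([branch(h) for branch in self.branches], dim=-1)
        )
        value = self.value_head(h)  # shape (batch, 1)
        return probs, value


# Usage sketch: sample a binary caching action and score the state
# (state_dim and num_contents here are arbitrary placeholder values).
net = BranchingActorCritic(state_dim=32, num_contents=10)
state = torch.randn(4, 32)          # batch of demand states
probs, value = net(state)
action = torch.bernoulli(probs)     # action[i, j] = 1 means cache content j
```

The design point is the factorization: with N contents, a flat policy over joint caching actions would need 2^N outputs, whereas the branching head needs only N, which is the complexity reduction the abstract attributes to the branching network.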