Abstract

With the rapid development of social networks and high-quality video sharing services, the demand for delivering large volumes of high-quality content under stringent end-to-end delay requirements is increasing. To meet this demand, we study the content caching problem at the network edge server, modelled as a Markov decision process, when the popularity profiles are unknown and time-varying. To adapt to the changing trends of content popularity, a context-aware popularity learning algorithm is proposed. We prove that the learning error of this scheme is sublinear in the number of requests. Based on the learned popularities, a reinforcement learning-based caching scheme is designed on top of the state-action-reward-state-action (SARSA) algorithm with function approximation. A reactive caching algorithm is also proposed to reduce the complexity. The time complexities of both caching schemes are studied to demonstrate their feasibility in real-time systems, and a theoretical analysis proves that the cache hit rate of the reactive caching algorithm asymptotically converges to the optimal cache hit rate. Finally, simulations are presented to demonstrate the superiority of the proposed algorithms.

Highlights

  • With the proliferation of machine-type communications and increasing user demand for video streaming, content caching is emerging as a key technology to reduce both network congestion and the delays experienced by users.

  • Reinforcement learning (RL)-based caching: in SARSA [31], an agent interacts with the environment and updates its policy based on the actions taken.

  • In the proposed RL-based caching scheme, each time a request arrives, Algorithm 1 is called to estimate the popularity of the requested file, and the SARSA algorithm then makes a caching decision using the learned file popularity.
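
The per-request loop described in the highlights (estimate popularity, then let SARSA decide) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's algorithm: a plain request-frequency counter stands in for Algorithm 1, the two-element feature vector, the drop-one-file action set, the Zipf-like request stream, and all names (`FrequencyEstimator`, `features`, `choose`, `simulate`) are hypothetical choices made only for this example.

```python
import random
from collections import Counter


class FrequencyEstimator:
    """Stand-in popularity learner (assumption): plain request frequencies."""

    def __init__(self):
        self.counts, self.total = Counter(), 0

    def observe(self, f):
        self.counts[f] += 1
        self.total += 1

    def __call__(self, f):
        return self.counts[f] / self.total if self.total else 0.0


def q_value(w, x):
    """Linear action-value approximation Q(s, a) = w . x(s, a)."""
    return sum(wi * xi for wi, xi in zip(w, x))


def sarsa_update(w, x, reward, x_next, alpha=0.05, gamma=0.9):
    """One on-policy SARSA step for a linear approximator."""
    td_error = reward + gamma * q_value(w, x_next) - q_value(w, x)
    return [wi + alpha * td_error * xi for wi, xi in zip(w, x)]


def features(dropped, pop):
    """Hypothetical features: bias, estimated popularity of the dropped file."""
    return [1.0, pop(dropped) if dropped is not None else 0.0]


def choose(w, cache, requested, capacity, pop, eps, rng):
    """Epsilon-greedy choice of which file to leave out of the cache."""
    candidates = cache | {requested}
    # None means "drop nothing"; it is the only action while the cache has room.
    actions = [None] if len(candidates) <= capacity else sorted(candidates)
    if rng.random() < eps:
        return rng.choice(actions)
    return max(actions, key=lambda f: q_value(w, features(f, pop)))


def simulate(n_requests=5000, n_files=20, capacity=5, eps=0.1, seed=0):
    rng = random.Random(seed)
    zipf = [1.0 / (k + 1) for k in range(n_files)]  # assumed request law
    pop, w, cache, hits = FrequencyEstimator(), [0.0, 0.0], set(), 0.0
    req = rng.choices(range(n_files), weights=zipf)[0]
    pop.observe(req)
    action = choose(w, cache, req, capacity, pop, eps, rng)
    for _ in range(n_requests):
        x = features(action, pop)
        cache = (cache | {req}) - {action}        # apply the caching decision
        nxt = rng.choices(range(n_files), weights=zipf)[0]
        reward = 1.0 if nxt in cache else 0.0     # reward is a cache hit
        hits += reward
        pop.observe(nxt)                          # update popularity estimate
        next_action = choose(w, cache, nxt, capacity, pop, eps, rng)
        w = sarsa_update(w, x, reward, features(next_action, pop))
        req, action = nxt, next_action
    return hits / n_requests, w


if __name__ == "__main__":
    hit_rate, weights = simulate()
    print(f"empirical cache hit rate: {hit_rate:.3f}, weights: {weights}")
```

The update is on-policy: the temporal-difference target uses the features of the action actually selected for the next request, which is what distinguishes SARSA from Q-learning. A richer feature vector or a different action space would change only `features` and `choose`.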


Summary

INTRODUCTION

With the proliferation of machine-type communications and increasing user demand for video streaming, content caching is emerging as a key technology to reduce both network congestion and the delays experienced by users. Although it is possible to estimate the global popularity in systems such as on-demand video streaming, the global popularity may not match the local popularity, because servers located at the network edge serve only a small geographical area with limited requests, and the trends of the global and local popularity can be totally different. Schemes that rely on prior knowledge of popularity need considerable time and may incur an economic cost to acquire it; otherwise, that knowledge can become outdated and be deemed invalid under non-stationary content popularity. These challenges motivate us to study the content caching scheme of an edge server under non-stationary and time-varying popularity profiles without any global information.

Related Works
Contributions
SYSTEM MODEL
Action Space
State Space and Transition Probability
Reward Function
Action-Value Function
CONTEXT-AWARE POPULARITY LEARNING
Context Information Management
Incremental Clustering-Assisted Popularity Learning
CACHING UPDATE ALGORITHMS
RL-based Caching
Reactive Caching
Learning Regret of File Popularity
Learning Regret of CHR
Time Complexity
Simulation Setup
Numerical Results
Correlated File Request Process
Parameter Determination
CONCLUSION
