Abstract
Caching popular contents in advance is an important technique for achieving low latency and reduced backhaul congestion in future wireless communication systems. In this article, a multi-cell massive multiple-input multiple-output system is considered, where the locations of base stations are distributed as a Poisson point process. Assuming probabilistic caching, the average success probability (ASP) of the system is derived for a known content popularity (CP) profile, which in practice is time-varying and unknown in advance. Further, modeling CP variations across time as a Markov process, reinforcement Q-learning is employed to learn the content placement strategy that optimizes the long-term discounted ASP and average cache refresh rate. In Q-learning, the number of Q-updates is large, proportional to the number of states and actions. To reduce the space complexity and update requirements toward scalable Q-learning, two novel function-approximation-based Q-learning approaches (linear and non-linear) are proposed, in which only a constant number of variables (4 and 3, respectively) needs to be updated, irrespective of the number of states and actions. Convergence of these approximation-based approaches is analyzed. Simulations verify that these approaches converge and successfully learn a similar best content placement, which demonstrates the applicability and scalability of the proposed approximated Q-learning schemes.
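To illustrate the scalability idea from the abstract, the following is a minimal sketch of linear function-approximation Q-learning, where Q(s, a) is approximated as a weighted sum of a fixed-size feature vector, so each update touches only a constant number of weights regardless of the numbers of states and actions. The feature map `phi` below is an illustrative placeholder, not the one derived in the paper.

```python
import numpy as np

def phi(state, action):
    # Constant-size (4-dimensional) feature vector: bias plus simple
    # state/action features. Placeholder choice for illustration only.
    return np.array([1.0, state, action, state * action])

def q_value(w, state, action):
    # Linear approximation: Q(s, a) ~= w^T phi(s, a).
    return w @ phi(state, action)

def q_update(w, s, a, reward, s_next, actions, gamma=0.9, alpha=0.1):
    # One semi-gradient Q-learning step: only the 4 weights in w change,
    # however large the state and action spaces are.
    target = reward + gamma * max(q_value(w, s_next, b) for b in actions)
    td_error = target - q_value(w, s, a)
    return w + alpha * td_error * phi(s, a)

w = np.zeros(4)                 # constant number of learned variables
actions = [0, 1, 2]
w = q_update(w, s=1, a=2, reward=1.0, s_next=0, actions=actions)
```

The key contrast with tabular Q-learning is storage and update cost: a table needs one entry per (state, action) pair, while this approximation keeps only the weight vector.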
Highlights
With the continuous development of various intelligent devices such as smart vehicles, smart home appliances,
Manuscript received June 15, 2020; revised September 12, 2020 and December 15, 2020; accepted December 15, 2020.
Since the success probability is difficult to analyze with respect to the Poisson point process (PPP) of base stations (BSs) ΦBS and the SINR model in (4), we instead analyze another point process with a more tractable SINR model, provided that the two point processes are statistically equivalent, which is defined as follows
The Q-learning algorithm is run for the finite-states, finite-policies (FSFP) scenario with the following parameters: number of popularity profiles in the finite set {p ∈ P}, |P| = 8; cardinality of the set of caching probabilities |A| = 32; content library size F = 1024; cache size L = 32; decay factor β = 0.1; learning rate β1 = 0.7; 10³ steps per episode; and a maximum of 100 episodes
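The tabular FSFP setup above can be sketched as follows. The reward and transition functions are placeholders: the paper's reward combines the ASP and the cache refresh rate, which requires the full system model and is stubbed out here, and the exploration rate is an assumption not stated in the parameter list.

```python
import numpy as np

rng = np.random.default_rng(0)

N_STATES = 8       # |P|: number of popularity profiles
N_ACTIONS = 32     # |A|: candidate caching-probability vectors
GAMMA = 0.1        # decay (discount) factor beta
ALPHA = 0.7        # learning rate beta_1
STEPS = 10**3      # steps per episode
EPISODES = 100     # maximum number of episodes
EPSILON = 0.1      # exploration rate (assumed; not given above)

def reward(s, a):
    # Placeholder standing in for the ASP/refresh-rate reward.
    return rng.random()

def step(s, a):
    # Placeholder Markov transition between popularity profiles.
    return rng.integers(N_STATES)

Q = np.zeros((N_STATES, N_ACTIONS))
s = 0
for _ in range(EPISODES):
    for _ in range(STEPS):
        # Epsilon-greedy action selection over the caching vectors.
        if rng.random() < EPSILON:
            a = int(rng.integers(N_ACTIONS))
        else:
            a = int(Q[s].argmax())
        s_next = step(s, a)
        # Standard tabular Q-update.
        Q[s, a] += ALPHA * (reward(s, a) + GAMMA * Q[s_next].max() - Q[s, a])
        s = s_next
```

Note the table holds |P| × |A| = 256 entries here; the function-approximation variants proposed in the paper replace this table with a constant number of learned variables.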