Abstract

In fog-assisted Internet-of-Things systems, it is common practice to cache popular content at the network edge to achieve high quality of service. Due to practical uncertainties such as unknown file popularities, cache placement scheme design remains an open problem with unresolved challenges: 1) how to keep time-averaged storage costs within budget; 2) how to incorporate online learning into cache placement to minimize performance loss (known as regret); and 3) how to exploit offline historical information to further reduce regret. In this article, we formulate the cache placement problem with unknown file popularities as a constrained combinatorial multi-armed bandit problem. To solve it, we employ virtual-queue techniques to manage the time-averaged storage cost constraints, and adopt history-aware bandit learning methods that integrate offline historical information into the online learning procedure to handle the exploration–exploitation tradeoff. Through an effective combination of online control and history-aware online learning, we devise a cache placement scheme with history-aware bandit learning called CPHBL. Our theoretical analysis and simulations show that CPHBL achieves a sublinear time-averaged regret bound. Moreover, the simulation results verify CPHBL's advantage over a deep reinforcement learning-based approach.
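The combination described above — a virtual queue tracking the time-averaged storage budget, plus UCB-style indices seeded with offline historical observations — can be illustrated with a minimal sketch. This is a hypothetical drift-plus-penalty formulation written for illustration only, not the paper's exact CPHBL algorithm; the function name, uniform per-file costs, Bernoulli request model, and the tradeoff parameter `V` are all assumptions.

```python
import math
import random

def cphbl_sketch(n_files, cache_size, budget, horizon, true_pop, history,
                 V=10.0, seed=0):
    """Hypothetical sketch of history-aware bandit caching under a
    time-averaged storage budget (not the paper's exact algorithm)."""
    rng = random.Random(seed)
    # History-aware part: seed counts and empirical means with offline samples.
    counts = [len(history[i]) for i in range(n_files)]
    means = [sum(history[i]) / len(history[i]) if history[i] else 0.0
             for i in range(n_files)]
    cost = [1.0] * n_files  # per-file storage cost (assumed uniform here)
    Q = 0.0                 # virtual queue for the storage-budget constraint
    total_hits = 0.0
    for t in range(1, horizon + 1):
        # UCB index per file; files with no observations get +inf
        # so they are explored at least once.
        ucb = [means[i] + math.sqrt(2.0 * math.log(t + 1) / counts[i])
               if counts[i] > 0 else float('inf')
               for i in range(n_files)]
        # Drift-plus-penalty score: V-scaled reward minus queue-weighted cost.
        score = [V * ucb[i] - Q * cost[i] for i in range(n_files)]
        cached = sorted(range(n_files), key=lambda i: score[i],
                        reverse=True)[:cache_size]
        # Observe Bernoulli requests for cached files; update statistics online.
        for i in cached:
            r = 1.0 if rng.random() < true_pop[i] else 0.0
            counts[i] += 1
            means[i] += (r - means[i]) / counts[i]
            total_hits += r
        # Virtual-queue update keeps the time-averaged cost near the budget.
        Q = max(Q + sum(cost[i] for i in cached) - budget, 0.0)
    return total_hits / horizon  # average hits per slot
```

Under this sketch, a larger `V` weights immediate caching reward more heavily, while the virtual queue `Q` grows whenever the slot's storage cost exceeds the budget and then penalizes costly placements, which is the standard Lyapunov-style mechanism for enforcing time-averaged constraints.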
