The presence of content stores in routers enables in-network caching in Named Data Networking (NDN), improving the user experience, particularly for content distribution. Typically, caching in the network is assumed to be homogeneous: every router operates its cache according to the same installed policy. Numerous in-network caching policies, varying in admission logic and caching attributes, have been proposed. Consequently, these policies vary in performance (efficiency) with many network factors, such as caching capacity, users' request patterns, content popularity distribution, and application type, among other contexts. A single scheme may not perform well under all network conditions and may thus waste precious cache space. Therefore, a strategy capable of dynamically selecting the optimal caching policy under varying network contexts is needed. In this direction, we propose a reinforcement learning (RL)-based hybrid caching strategy, namely Cache-MAB, in which routers operate in a distributed manner and learn to pick the most suitable policy for caching a content. The set of candidate policies is formed by grouping policies that base their caching decisions on similar caching attributes. The proposed strategy decouples caching-policy selection from the admission logic of the selected policy and models policy selection as a Multi-Armed Bandit (MAB) problem. The hybrid strategy equips each router with a diverse set of caching policies and an RL agent that uses a reinforcement learning algorithm to solve the MAB problem and thereby select the optimal policy in each case. The optimal policy is the one that maximizes local performance, for instance, the cache hit rate. Simulation results confirm that the hybrid strategy adapts to different network scenarios and effectively matches the optimal policy's performance, with a performance gap of less than 1%.
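To make the policy-selection idea concrete, the following is a minimal, hypothetical sketch of a per-router bandit agent that chooses among a set of caching policies and learns from a hit-based reward. The policy names, the epsilon-greedy exploration scheme, and all identifiers here are illustrative assumptions, not the authors' actual algorithm or code.

```python
import random


class CacheMABAgent:
    """Illustrative epsilon-greedy bandit over a set of caching policies.

    Hypothetical sketch: each router treats every caching policy as one
    bandit arm and rewards an arm when the content it admitted later
    produces cache hits. Not the paper's actual Cache-MAB algorithm.
    """

    def __init__(self, policies, epsilon=0.1):
        self.policies = policies              # e.g. ["LCE", "LCD", "ProbCache"] (assumed set)
        self.epsilon = epsilon                # exploration probability
        self.counts = [0] * len(policies)     # number of pulls per arm
        self.values = [0.0] * len(policies)   # running mean reward per arm

    def select_policy(self):
        # Explore a random policy with probability epsilon,
        # otherwise exploit the arm with the best estimated reward.
        if random.random() < self.epsilon:
            return random.randrange(len(self.policies))
        return max(range(len(self.policies)), key=lambda i: self.values[i])

    def update(self, arm, reward):
        # Incremental mean update: Q <- Q + (r - Q) / n.
        self.counts[arm] += 1
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]
```

In a simulation loop, the router would call `select_policy()` per caching decision, apply the chosen policy's admission logic, and later feed back a reward such as the observed local hit rate via `update()`; under a stationary reward distribution the estimates concentrate on the best-performing policy.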