Abstract

To satisfy the increasing demand of cellular traffic, cooperative content caching at the network edge (e.g., at User Equipment) has become a promising paradigm in next-generation cellular networks. Device-to-Device (D2D) communications can improve content caching and fetching performance without deploying additional infrastructure. In this paper, we investigate the joint optimization of cooperative caching and fetching in a dynamic D2D environment to minimize the overall content fetching delay. We formulate this problem as a decentralized partially observable Markov game in which each agent seeks an optimal policy. To solve it, we propose a Fully Decentralized Soft Multi-Agent Reinforcement Learning (FDS-MARL) algorithm, which extends the soft actor-critic framework to a non-stationary multi-agent environment for fully decentralized learning. FDS-MARL comprises three major design components: Graph Attention Network based self-attention for cooperative inter-agent coordination; a consensus communication mechanism that reduces information loss and environmental non-stationarity while gradually building global consensus; and an influence-based transmission scheduling mechanism for effective credit assignment and for alleviating potential transmission contention among agents. Simulation results show that FDS-MARL significantly improves content caching and fetching performance compared with representative work in the literature.
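As a rough illustration of the Graph Attention Network based self-attention component, the following minimal PyTorch sketch shows how an agent could weight its D2D neighbors' observation embeddings with learned attention scores before aggregating them into a coordination context. The class name `AgentAttention` and the dimensions are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of GAT-style self-attention over neighboring D2D agents.
# Assumes PyTorch; names and dimensions are hypothetical.
import torch
import torch.nn as nn
import torch.nn.functional as F


class AgentAttention(nn.Module):
    """Aggregates neighbor observation embeddings with learned attention weights."""

    def __init__(self, obs_dim: int, embed_dim: int):
        super().__init__()
        self.embed = nn.Linear(obs_dim, embed_dim)   # shared node embedding
        self.attn = nn.Linear(2 * embed_dim, 1)      # scores a (self, neighbor) pair

    def forward(self, self_obs: torch.Tensor, neighbor_obs: torch.Tensor) -> torch.Tensor:
        # self_obs: (obs_dim,), neighbor_obs: (num_neighbors, obs_dim)
        h_i = self.embed(self_obs)                            # (embed_dim,)
        h_j = self.embed(neighbor_obs)                        # (num_neighbors, embed_dim)
        pairs = torch.cat([h_i.expand_as(h_j), h_j], dim=-1)  # (num_neighbors, 2*embed_dim)
        scores = F.leaky_relu(self.attn(pairs)).squeeze(-1)   # (num_neighbors,)
        alpha = torch.softmax(scores, dim=0)                  # attention over neighbors
        return (alpha.unsqueeze(-1) * h_j).sum(dim=0)         # aggregated neighbor context


# Example: an agent aggregates cache-state observations from three D2D neighbors.
attn = AgentAttention(obs_dim=8, embed_dim=16)
context = attn(torch.randn(8), torch.randn(3, 8))
print(context.shape)  # torch.Size([16])
```

In an actor-critic setting such as the one the abstract describes, this aggregated context would typically be concatenated with the agent's own embedding and fed to its policy and value networks; the exact wiring in FDS-MARL is detailed in the paper itself.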
