Deep Reinforcement Learning for Dynamic Clustering and Resource Allocation in Smart-Duplex Networks

Dan Wang, Chuan Huang

doi:10.1109/wcnc51071.2022.9771660

Abstract

This paper considers an ultra-dense network (UDN) with smart-duplex (SD), which allows the base stations (BSs) to flexibly switch between half-duplex (HD) and full-duplex (FD) modes over time. All the small cells are divided into several clusters, and the BSs in the same cluster jointly serve their users. A Markov decision process (MDP) problem is formulated to maximize the average weighted sum of network throughput and clustering cost over all clusters. To approximately solve this problem, we first adopt an affinity propagation method to determine the number of clusters and the center of each cluster. Then, by treating small cells as agents, the original MDP problem is proved to be equivalent to a multi-agent MDP that maximizes the average reward of all small cells. Next, a multi-agent deep reinforcement learning (DRL) algorithm is proposed to jointly implement dynamic clustering for the non-center small cells, resource allocation, and duplex mode selection. Simulation results show that SD has prominent advantages over both the HD and FD modes in UDNs, and that the proposed algorithm outperforms other clustering schemes under the considered scenarios.
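To illustrate the first step, the sketch below runs a textbook affinity propagation pass over a handful of small-cell BS positions and returns the cluster centers (exemplars) together with an initial assignment. It is not the authors' implementation: the abstract does not specify the similarity measure or the preference value, so the negative squared Euclidean distance, the median preference, the damping factor, and the coordinates in `bs_positions` are all assumptions made for illustration.

```python
import statistics


def affinity_propagation(points, damping=0.7, iterations=200):
    """Textbook affinity propagation on a small list of coordinate tuples.

    Assumptions (not taken from the paper): similarity is the negative
    squared Euclidean distance, and every point shares the same
    preference, set to the median off-diagonal similarity.
    """
    n = len(points)
    s = [[-sum((p - q) ** 2 for p, q in zip(points[i], points[k]))
          for k in range(n)] for i in range(n)]
    preference = statistics.median(
        s[i][k] for i in range(n) for k in range(n) if i != k)
    for i in range(n):
        s[i][i] = preference

    r = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    a = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i, k) = s(i, k) - max over k' != k of (a(i, k') + s(i, k'))
        for i in range(n):
            vals = [a[i][k] + s[i][k] for k in range(n)]
            for k in range(n):
                best_other = max(v for j, v in enumerate(vals) if j != k)
                r[i][k] = damping * r[i][k] + (1 - damping) * (s[i][k] - best_other)

        # a(k, k) = sum over i' != k of max(0, r(i', k))
        # a(i, k) = min(0, r(k, k) + sum over i' not in {i, k} of max(0, r(i', k)))
        for k in range(n):
            pos = [max(0.0, r[i][k]) for i in range(n)]
            total = sum(pos) - pos[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, r[k][k] + total - pos[i])
                a[i][k] = damping * a[i][k] + (1 - damping) * new

    # A point is an exemplar (cluster center) when r(k, k) + a(k, k) > 0.
    exemplars = [k for k in range(n) if r[k][k] + a[k][k] > 0]
    if not exemplars:
        return [], []
    # Each exemplar labels itself; every other point joins its most similar exemplar.
    labels = [i if i in exemplars else max(exemplars, key=lambda k: s[i][k])
              for i in range(n)]
    return exemplars, labels


# Hypothetical BS coordinates forming two well-separated groups.
bs_positions = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centers, labels = affinity_propagation(bs_positions)
```

Here `centers` holds the indices of the BSs chosen as cluster centers and `labels[i]` gives the center that BS `i` is attached to. In the scheme described above, only the number of clusters and the centers are fixed at this stage; the assignment of the non-center small cells is subsequently handled dynamically by the multi-agent DRL algorithm.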

Full Text