Deep Reinforcement Learning Based Resource Allocation for D2D Communications Underlay Cellular Networks

Seoyoung Yu, Jeong Woo Lee

doi:10.3390/s22239459

Abstract

In this paper, a resource allocation (RA) scheme based on deep reinforcement learning (DRL) is designed for device-to-device (D2D) communications underlaying cellular networks. The goal of RA is to determine the transmission power and spectrum channel of each D2D link so as to maximize the sum of the average effective throughput of all cellular and D2D links in a cell, accumulated over multiple time steps, where a cellular channel can be allocated to multiple D2D links. Allowing a cellular channel to be shared by multiple D2D links and considering performance over multiple time steps incur such high system overhead and computational complexity that optimal RA is practically infeasible in this scenario, especially when a large number of D2D links are involved. To mitigate the complexity, we propose a sub-optimal RA scheme based on multi-agent DRL, which operates on information shared among participating devices, such as locations and allocated resources. Each agent corresponds to one D2D link, and the agents perform learning in a staggered and cyclic manner. The proposed DRL-based RA scheme allocates resources to D2D devices promptly as the network set-up, including device locations, varies dynamically. The proposed sub-optimal RA scheme outperforms other schemes, and the performance gain becomes significant when the density of devices in a cell is high.
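To illustrate why letting several D2D links share one cellular channel couples their allocation decisions, the following is a minimal sketch of a sum-rate objective under co-channel interference. It is not the paper's formulation: the Shannon rate log2(1 + SINR) is used here as an assumed stand-in for the "average effective throughput", only a single time step is evaluated, and the function name `sum_rate` and its parameters (`channels`, `powers`, `gains`, `noise`) are hypothetical.

```python
import math


def sum_rate(channels, powers, gains, noise=1e-3):
    """Sum of Shannon rates (bit/s/Hz) over all links at one time step.

    Assumed, simplified model (not taken from the paper):
      channels[i] -- channel index used by link i (cellular or D2D)
      powers[i]   -- transmit power of link i
      gains[i][j] -- channel gain from the transmitter of link j
                     to the receiver of link i
      noise       -- receiver noise power
    Links on the same channel interfere with each other.
    """
    n = len(channels)
    total = 0.0
    for i in range(n):
        # Interference comes only from other links on the same channel.
        interference = sum(
            powers[j] * gains[i][j]
            for j in range(n)
            if j != i and channels[j] == channels[i]
        )
        sinr = powers[i] * gains[i][i] / (noise + interference)
        total += math.log2(1.0 + sinr)
    return total


# Two links with unit powers, unit gains, and unit noise.
gains = [[1.0, 1.0], [1.0, 1.0]]
separate = sum_rate([0, 1], [1.0, 1.0], gains, noise=1.0)  # no sharing
shared = sum_rate([0, 0], [1.0, 1.0], gains, noise=1.0)    # same channel
print(separate, shared)
```

In this toy case the two links achieve a higher sum rate on separate channels than on a shared one, because sharing adds mutual interference. With many D2D links and few channels, sharing is unavoidable, and the joint choice of channel and power for every link is what makes the exact optimization costly.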

Full Text