Abstract

Device-to-device (D2D) communications are envisioned as a key technology for supporting future ubiquitous mobile applications. However, the requirements of high mobility and low latency severely constrain the achievable performance of D2D communications. In this letter, we investigate a deep reinforcement learning (DRL) based power and resource allocation scheme that maximizes the throughput of D2D users (DUEs) and cellular users (CUEs). Following the DRL framework, D2D pairs are modeled as distributed agents that adaptively select their transmission power and resources to mitigate co-channel interference without any prior information. Furthermore, a priority-sampling-based dueling double deep Q-network (PS-D3QN) distributed algorithm is proposed to help the agents learn the predominant features. Simulation results show that the proposed algorithm achieves higher throughput than existing DRL algorithms under a strict delay constraint. In particular, the probability that an agent selects high transmission power decreases as the remaining transmission time increases, which demonstrates that the agents effectively learn and dynamically sense the impact of the delay constraint.
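The abstract names the building blocks of PS-D3QN (a dueling network architecture, a double-DQN target, and prioritized experience sampling) without implementation details. The following is a minimal PyTorch sketch of how those three components typically fit together; the names, dimensions, and hyperparameters (state_dim, n_actions, gamma, the hidden layer size) are illustrative assumptions, not taken from the paper.

```python
# Sketch of the PS-D3QN ingredients: dueling Q-network, double-DQN target,
# and importance-sampling weights from prioritized replay.
# state_dim and n_actions (e.g. joint power-level x resource-block choices
# of a D2D pair) are hypothetical placeholders, not from the paper.
import torch
import torch.nn as nn

class DuelingQNet(nn.Module):
    def __init__(self, state_dim: int, n_actions: int, hidden: int = 128):
        super().__init__()
        self.feature = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.value = nn.Linear(hidden, 1)              # state value V(s)
        self.advantage = nn.Linear(hidden, n_actions)  # advantages A(s, a)

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        h = self.feature(s)
        v, a = self.value(h), self.advantage(h)
        # Dueling aggregation: Q(s,a) = V(s) + A(s,a) - mean_a' A(s,a')
        return v + a - a.mean(dim=1, keepdim=True)

def double_dqn_loss(online, target, batch, gamma=0.99):
    """TD loss with a double-DQN target, weighted by the importance-sampling
    weights that prioritized replay attaches to each sampled transition."""
    s, a, r, s_next, done, is_weights = batch
    q = online(s).gather(1, a.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Double DQN: the online net selects the next action,
        # the target net evaluates it, reducing overestimation bias.
        a_next = online(s_next).argmax(dim=1, keepdim=True)
        q_next = target(s_next).gather(1, a_next).squeeze(1)
        td_target = r + gamma * (1.0 - done) * q_next
    td_error = q - td_target
    loss = (is_weights * td_error.pow(2)).mean()
    # |td_error| would be fed back to the replay buffer as new priorities.
    return loss, td_error.detach().abs()
```

In the distributed setting the abstract describes, each D2D pair would run its own copy of such an agent, with the state presumably encoding local channel conditions and the remaining transmission time under the delay constraint.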
