Abstract

Improving energy efficiency is increasingly important in the design of future cellular systems. In this work, we address energy efficiency in device-to-device (D2D)-enabled heterogeneous cellular networks. Specifically, communication mode selection and resource allocation are considered jointly, with the aim of maximizing long-term energy efficiency. The problem is formulated as a Markov decision process (MDP) in which each user can switch dynamically between the traditional cellular mode and the D2D mode. We employ deep deterministic policy gradient (DDPG), a model-free deep reinforcement learning algorithm, to solve the MDP over continuous state and action spaces. The proposed architecture consists of one actor network and one critic network: the actor network uses the deterministic policy gradient scheme to generate deterministic actions for the agent directly, and the critic network uses a value-function-based Q-network to evaluate the actor's performance. Simulation results demonstrate the convergence of the proposed algorithm and its effectiveness in improving energy efficiency in a D2D-enabled heterogeneous network.
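To make the actor-critic structure mentioned in the abstract concrete, below is a minimal, hypothetical sketch of the DDPG components: a deterministic actor mapping a continuous state to a continuous action, a critic estimating Q(s, a), and the soft (Polyak) update of a target network that DDPG uses for stable training. The state/action dimensions and all hyperparameters are illustrative assumptions, not values from the paper, and a single linear layer stands in for the full neural networks.

```python
import math
import random

class Linear:
    """A single linear layer, standing in for a full neural network (illustrative)."""
    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(out_dim)]

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]

class Actor:
    """Deterministic policy: state -> action, squashed into [-1, 1] by tanh."""
    def __init__(self, state_dim, action_dim):
        self.net = Linear(state_dim, action_dim, seed=1)

    def act(self, state):
        return [math.tanh(v) for v in self.net.forward(state)]

class Critic:
    """Q-network: (state, action) -> scalar value estimate of the actor's choice."""
    def __init__(self, state_dim, action_dim):
        self.net = Linear(state_dim + action_dim, 1, seed=2)

    def q_value(self, state, action):
        return self.net.forward(state + action)[0]

def soft_update(target, source, tau=0.005):
    """Polyak averaging of target-network weights, as used in DDPG."""
    for t_row, s_row in zip(target.net.w, source.net.w):
        for i in range(len(t_row)):
            t_row[i] = (1.0 - tau) * t_row[i] + tau * s_row[i]

# Hypothetical dimensions: e.g. a 4-feature channel/queue state and a
# 2-dimensional action (mode-selection score, transmit-power level).
state_dim, action_dim = 4, 2
actor = Actor(state_dim, action_dim)
critic = Critic(state_dim, action_dim)
target_actor = Actor(state_dim, action_dim)

state = [0.5, -0.2, 0.1, 0.8]       # a toy continuous state
action = actor.act(state)           # deterministic continuous action
q = critic.q_value(state, action)   # critic evaluates the actor's action
soft_update(target_actor, actor)    # slowly track the learned actor
```

In training, the critic would be fit to a temporal-difference target and the actor updated along the deterministic policy gradient through the critic; this sketch only shows the data flow between the two networks.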
