Abstract

Improving energy efficiency is increasingly important in the design of future cellular systems. In this work, we address energy efficiency in device-to-device (D2D)-enabled heterogeneous cellular networks. Specifically, communication mode selection and resource allocation are considered jointly, with the aim of maximizing long-term energy efficiency. The problem is formulated as a Markov decision process (MDP) in which each user can switch dynamically between the traditional cellular mode and the D2D mode. We employ deep deterministic policy gradient (DDPG), a model-free deep reinforcement learning algorithm, to solve the MDP over continuous state and action spaces. The proposed architecture consists of one actor network and one critic network: the actor network uses the deterministic policy gradient scheme to generate deterministic actions for the agent directly, and the critic network uses a value-function-based Q-network to evaluate the actor's performance. Simulation results demonstrate the convergence of the proposed algorithm and its effectiveness in improving energy efficiency in a D2D-enabled heterogeneous network.
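To make the actor-critic structure mentioned in the abstract concrete, below is a minimal, hypothetical sketch of the DDPG components: a deterministic actor mapping a continuous state to a continuous action, a critic estimating Q(s, a), and the soft (Polyak) update of a target network that DDPG uses for stable training. The state/action dimensions and all hyperparameters are illustrative assumptions, not values from the paper, and a single linear layer stands in for the full neural networks.

```python
import math
import random

class Linear:
    """A single linear layer, standing in for a full neural network (illustrative)."""
    def __init__(self, in_dim, out_dim, seed=0):
        rng = random.Random(seed)
        self.w = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
                  for _ in range(out_dim)]

    def forward(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) for row in self.w]

class Actor:
    """Deterministic policy: state -> action, squashed into [-1, 1] by tanh."""
    def __init__(self, state_dim, action_dim):
        self.net = Linear(state_dim, action_dim, seed=1)

    def act(self, state):
        return [math.tanh(v) for v in self.net.forward(state)]

class Critic:
    """Q-network: (state, action) -> scalar value estimate of the actor's choice."""
    def __init__(self, state_dim, action_dim):
        self.net = Linear(state_dim + action_dim, 1, seed=2)

    def q_value(self, state, action):
        return self.net.forward(state + action)[0]

def soft_update(target, source, tau=0.005):
    """Polyak averaging of target-network weights, as used in DDPG."""
    for t_row, s_row in zip(target.net.w, source.net.w):
        for i in range(len(t_row)):
            t_row[i] = (1.0 - tau) * t_row[i] + tau * s_row[i]

# Hypothetical dimensions: e.g. a 4-feature channel/queue state and a
# 2-dimensional action (mode-selection score, transmit-power level).
state_dim, action_dim = 4, 2
actor = Actor(state_dim, action_dim)
critic = Critic(state_dim, action_dim)
target_actor = Actor(state_dim, action_dim)

state = [0.5, -0.2, 0.1, 0.8]       # a toy continuous state
action = actor.act(state)           # deterministic continuous action
q = critic.q_value(state, action)   # critic evaluates the actor's action
soft_update(target_actor, actor)    # slowly track the learned actor
```

In training, the critic would be fit to a temporal-difference target and the actor updated along the deterministic policy gradient through the critic; this sketch only shows the data flow between the two networks.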
