Abstract

This paper proposes a joint cluster association and power allocation scheme for resource management in a D2D-NOMA-enabled heterogeneous network. The objective is to maximize the users' long-term cumulative energy efficiency. A novel $K$-times iteration method based on matching theory is constructed to handle the NOMA cluster association, which is formulated as the perfect matching of a weighted bipartite graph. Because complete instantaneous channel state information is difficult to obtain, we offer a deep reinforcement learning (DRL) framework that makes DRL methods feasible for the NP-hard power allocation problem. In addition, an optimal reward function is designed to improve the efficiency of training and interaction. Based on the proposed DRL framework, we propose the Dinkelbach-TD3 power allocation approach, which uses the Dinkelbach fractional programming method to shrink the action range of the twin delayed deep deterministic policy gradient (TD3) algorithm, thereby increasing the training efficiency and precision of the DRL methods. Simulation results demonstrate that the proposed DRL framework significantly improves energy efficiency.
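
For readers unfamiliar with the Dinkelbach method named above, the following is a minimal sketch of the classical Dinkelbach iteration applied to a toy single-link energy-efficiency problem: maximize log2(1 + g·p) / (p + p_circuit) over 0 ≤ p ≤ p_max. It is not the paper's Dinkelbach-TD3 algorithm and involves no reinforcement learning; the single-link model, the function name `dinkelbach_power`, and the parameters `g`, `p_circuit`, and `p_max` are all assumptions made for illustration.

```python
import math


def dinkelbach_power(g, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Toy Dinkelbach iteration for max log2(1 + g*p) / (p + p_circuit).

    Assumed (hypothetical) model: one link with channel gain g > 0,
    circuit power p_circuit > 0, and transmit power p in [0, p_max].
    Returns (p, lam), where lam approximates the maximum ratio.
    """
    lam = 0.0  # current estimate of the optimal energy efficiency
    p = p_max
    for _ in range(max_iter):
        # Inner step: maximize log2(1 + g*p) - lam * (p + p_circuit).
        # The objective is concave in p, so the clipped stationary point
        # p = 1 / (lam * ln 2) - 1 / g is the maximizer on [0, p_max].
        if lam > 0.0:
            p = min(max(1.0 / (lam * math.log(2)) - 1.0 / g, 0.0), p_max)
        else:
            p = p_max  # with lam = 0 the rate alone is maximized
        rate = math.log2(1.0 + g * p)
        power = p + p_circuit
        # Stop once the parametric objective F(lam) is (numerically) zero.
        if abs(rate - lam * power) < tol:
            break
        lam = rate / power  # Dinkelbach update of the ratio estimate
    return p, lam
```

A call such as `dinkelbach_power(10.0, 0.5, 2.0)` returns a transmit power inside the allowed range together with the corresponding energy efficiency. The update replaces the fractional objective with a sequence of subtractive problems, which is the general idea behind using Dinkelbach to narrow the range of candidate power levels.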
