Renewable energy source (RES)-powered base stations have attracted considerable research interest in recent years because they can expand network coverage without requiring a power grid to be built. This paper proposes a novel user association (UA), resource allocation (RA), and dynamic power control (PC) scheme that maximizes $\alpha$-fairness in RES-assisted small cell networks. $\alpha$-fairness is a general notion that flexibly adjusts the balance among throughput, proportional fairness, and max-min fairness according to the value of $\alpha$. Nevertheless, no existing study has proposed UA, RA, and PC to maximize $\alpha$-fairness, because the problem is NP-hard. Furthermore, fixed-policy-based PC designs cannot account for the time-varying environments (e.g., energy harvesting models and wireless channels) of RES-assisted networks. We first provide a Lagrangian duality-based algorithm that solves the UA and RA problem for a fixed PC. Next, we propose a dynamic PC scheme based on deep reinforcement learning (DRL) that chooses the best PC in light of the time-varying environments. However, because the UA and RA algorithm, which is executed in each step of the dynamic PC, requires a long computation time, we accelerate the computation of the UA and RA with DRL.
Inspired by Lagrangian duality, we design a DRL-based UA and RA scheme with a low-dimensional continuous variable by relaxing the UA variable, whose cardinality increases exponentially with the number of base stations and users. The simulation results show that the proposed scheme achieves a computation time 100 times shorter than that of the optimization-based schemes by evaluating only two neural networks. In particular, although there have been numerous studies on proportional fairness maximization, the proposed scheme outperforms the optimization-based schemes in the throughput, proportional fairness, and max-min fairness metrics.
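
To make the role of $\alpha$ concrete, the sketch below evaluates the $\alpha$-fair utility in its standard form from the network utility maximization literature: $U_\alpha(r) = r^{1-\alpha}/(1-\alpha)$ for $\alpha \neq 1$ and $U_1(r) = \log r$. The abstract does not state the exact objective used in the paper, so this form, the function name `alpha_fair_utility`, and the two rate vectors are illustrative assumptions rather than the authors' formulation.

```python
import math


def alpha_fair_utility(rates, alpha):
    """Sum of alpha-fair utilities over strictly positive user rates.

    Assumed standard form (not taken from the paper):
      alpha == 1 : sum(log r)                      -> proportional fairness
      otherwise  : sum(r**(1 - alpha) / (1 - alpha))
    alpha == 0 reduces to the sum rate (throughput); large alpha
    increasingly favors the worst-off user (toward max-min fairness).
    """
    if alpha < 0:
        raise ValueError("alpha must be non-negative")
    if any(r <= 0 for r in rates):
        raise ValueError("rates must be strictly positive")
    if math.isclose(alpha, 1.0):
        return sum(math.log(r) for r in rates)
    return sum(r ** (1.0 - alpha) / (1.0 - alpha) for r in rates)


# Two hypothetical two-user rate allocations (arbitrary units).
skewed = [9.0, 1.0]    # higher total rate, one user nearly starved
balanced = [4.0, 4.0]  # lower total rate, equal shares

for alpha in (0.0, 1.0, 2.0):
    u_skewed = alpha_fair_utility(skewed, alpha)
    u_balanced = alpha_fair_utility(balanced, alpha)
    preferred = "skewed" if u_skewed > u_balanced else "balanced"
    print(f"alpha={alpha}: skewed={u_skewed:.3f}, "
          f"balanced={u_balanced:.3f}, preferred={preferred}")
```

Under these assumed values, $\alpha = 0$ prefers the skewed allocation because only the total rate counts, whereas $\alpha = 1$ and $\alpha = 2$ prefer the balanced one, which is the trade-off the abstract describes.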