Abstract

This study addresses the power allocation problem of maximizing the sum of the generalized mutual information, i.e., the achievable rate under imperfect channel state information, through a reinforcement learning (RL) approach in energy harvesting communications. In contrast to conventional deep RL applications, which impose a heavy computational load on devices due to the use of deep neural networks, we adopt shallow RL architectures that incorporate structural properties of the optimal power allocation policy. To design shallow architectures that can fully capture the desired power allocation policy, we derive the partial monotonicity of, and bounds on, the policy and value functions. These structural properties provide the mathematical basis on which the shallow architecture is constructed. We use a deterministic policy gradient method with monotonically shape-constrained approximators, which allows us to avoid overly complicated deep neural networks that are unsuitable for low-power devices. Through various experiments, we visualize the solutions obtained from the proposed shallow architectures and demonstrate that the proposed method outperforms existing power allocation policies and exhibits greater robustness owing to the optimal structural properties.
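To make the idea of a monotonically shape-constrained shallow approximator concrete, the following is a minimal sketch, not the paper's actual architecture. It assumes a single-hidden-layer policy whose output power is nondecreasing in a scalar state (e.g., a battery level); the names `MonotoneShallowPolicy`, `n_hidden`, and the battery-level input are illustrative assumptions, and monotonicity is enforced by mapping unconstrained parameters to nonnegative weights.

```python
import numpy as np

def softplus(x):
    # Smooth map from unconstrained reals to nonnegative values.
    return np.log1p(np.exp(x))

class MonotoneShallowPolicy:
    """Illustrative shallow policy pi(s) = sum_j v_j * sigmoid(w_j * s + b_j).

    With v_j >= 0 and w_j >= 0, pi(s) is nondecreasing in the scalar state s
    (assumed here to be, e.g., a normalized battery level). This is a sketch
    of a shape-constrained approximator, not the paper's exact design.
    """

    def __init__(self, n_hidden=8, seed=0):
        rng = np.random.default_rng(seed)
        # Unconstrained parameters; softplus maps them to nonnegative weights.
        self.a = rng.normal(size=n_hidden)   # -> v_j = softplus(a_j) >= 0
        self.c = rng.normal(size=n_hidden)   # -> w_j = softplus(c_j) >= 0
        self.b = rng.normal(size=n_hidden)   # hidden biases (unconstrained)

    def __call__(self, s):
        v = softplus(self.a)
        w = softplus(self.c)
        hidden = 1.0 / (1.0 + np.exp(-(w * s + self.b)))
        return float(np.sum(v * hidden))

# Usage: the allocated power never decreases as the battery level grows,
# by construction, regardless of the (random) parameter values.
policy = MonotoneShallowPolicy()
battery_levels = np.linspace(0.0, 1.0, 5)
powers = [policy(s) for s in battery_levels]
assert all(p1 <= p2 + 1e-12 for p1, p2 in zip(powers, powers[1:]))
print(powers)
```

In a deterministic policy gradient loop, the unconstrained parameters `a`, `c`, and `b` would be updated by gradient ascent on the critic's estimate, while the softplus reparameterization keeps the monotonic shape constraint satisfied at every step.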
