Abstract

This letter considers energy management and power allocation for energy harvesting (EH) communications. We formulate a joint optimization of the continuous EH time and transmit power that maximizes the long-term throughput using deep deterministic policy gradient (DDPG). Optimizing both variables jointly, however, leads to a large continuous action space. To reduce the dimension of the action space, we present a deep reinforcement learning (DRL) framework that combines DDPG with convex programming. Using the primal decomposition method, the original problem is decomposed into a two-layer optimization. The upper-layer (primary) problem is solved by DDPG over a low-dimensional action space, and the lower-layer subproblem is solved with an existing convex optimization toolbox. Numerical simulation results show that the proposed DRL framework achieves higher long-term throughput than existing energy management or power allocation policies for EH communications.
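
The two-layer structure can be illustrated with a toy sketch. The abstract does not give the system model, so everything below is an assumption made for illustration only: a single harvest-then-transmit slot with no battery, a constant harvesting power, parallel channels with known gains, and the harvesting fraction `tau` as the upper-layer variable. A plain grid search stands in for the DDPG agent, and a bisection-based water-filling routine stands in for the convex toolbox. All function and parameter names are hypothetical.

```python
import math

def water_filling(gains, power_budget, iters=100):
    """Lower layer (assumed model): maximize sum(log2(1 + g*p))
    subject to sum(p) <= power_budget and p >= 0."""
    if power_budget <= 0:
        return [0.0] * len(gains)
    # Bisect on the water level; p_i = max(0, level - 1/g_i).
    lo, hi = 0.0, power_budget + max(1.0 / g for g in gains)
    for _ in range(iters):
        level = 0.5 * (lo + hi)
        used = sum(max(0.0, level - 1.0 / g) for g in gains)
        if used > power_budget:
            hi = level
        else:
            lo = level
    return [max(0.0, lo - 1.0 / g) for g in gains]

def slot_throughput(tau, gains, harvest_power, slot_len=1.0):
    """Optimal value of the lower-layer problem for a fixed harvesting fraction tau."""
    tx_time = (1.0 - tau) * slot_len
    if tx_time <= 0:
        return 0.0, [0.0] * len(gains)
    energy = harvest_power * tau * slot_len          # energy harvested in this slot
    powers = water_filling(gains, energy / tx_time)  # spend it all while transmitting
    rate = sum(math.log2(1.0 + g * p) for g, p in zip(gains, powers))
    return tx_time * rate, powers

def solve_upper_layer(gains, harvest_power, grid=200):
    """Upper layer: grid search over tau (a stand-in for the DDPG agent)."""
    value, tau = max(
        (slot_throughput(k / grid, gains, harvest_power)[0], k / grid)
        for k in range(grid + 1)
    )
    return tau, value

if __name__ == "__main__":
    tau_star, best = solve_upper_layer(gains=[2.0, 1.0, 0.25], harvest_power=3.0)
    print(f"tau* = {tau_star:.3f}, throughput = {best:.3f} bits")
```

The point of the sketch is the split itself: the upper layer only searches over one scalar, while the per-channel powers are recovered by a convex solve, which is the kind of action-space reduction the abstract describes.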
