Abstract

Energy harvesting point-to-point communications are considered. The transmitter harvests energy from the environment and stores it in a finite battery. It is assumed that the transmitter always has data to transmit and that the harvested energy is used exclusively for data transmission. Since in practical scenarios prior knowledge about the energy harvesting process might not be available, we assume that at each time instant only information about the current state of the transmitter is available, i.e., the harvested energy, the battery level, and the channel coefficient. We model the scenario as a Markov decision process and implement reinforcement learning at the transmitter to find a power allocation policy that aims at maximizing the throughput. To overcome the limitations of traditional reinforcement learning algorithms, we apply the concept of function approximation and propose a set of binary functions to approximate the expected throughput given the state of the transmitter. Numerical results show that the proposed approach, which requires only causal knowledge of the energy harvesting process and the channel coefficients, suffers only a small degradation in performance compared to the optimal case, which requires perfect non-causal knowledge. Additionally, the proposed approach outperforms naïve policies that assume only causal knowledge at the transmitter.
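To make the approach concrete, the following is a minimal sketch of reinforcement learning with linear function approximation over binary (indicator) features, in the spirit of the abstract. It is not the paper's exact formulation: the state quantization, the toy i.i.d. harvesting and channel model, the throughput-proxy reward log2(1 + p·g), the use of Q-learning, and all hyperparameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# --- Illustrative discretization (assumed, not from the paper) ---
N_BATTERY = 10                       # quantized battery levels
N_HARVEST = 4                        # quantized harvested-energy levels
N_CHANNEL = 4                        # quantized channel-gain levels
ACTIONS = np.linspace(0.0, 1.0, 5)   # candidate transmit-power fractions

N_FEATURES = N_BATTERY * N_HARVEST * N_CHANNEL * len(ACTIONS)
theta = np.zeros(N_FEATURES)         # linear weights over binary features

def feature_index(b, e, h, a):
    """Index of the single active binary (indicator) feature for a
    (battery, harvested energy, channel, action) tuple."""
    return ((b * N_HARVEST + e) * N_CHANNEL + h) * len(ACTIONS) + a

def q_value(state, a):
    # With one-hot binary features, the linear approximation
    # theta^T phi(s, a) reduces to reading a single weight.
    b, e, h = state
    return theta[feature_index(b, e, h, a)]

def epsilon_greedy(state, eps=0.1):
    if rng.random() < eps:
        return rng.integers(len(ACTIONS))
    return int(np.argmax([q_value(state, a) for a in range(len(ACTIONS))]))

def step(state, a):
    """Toy environment: i.i.d. energy arrivals and channel states
    (an assumed model; the transmitter only observes them causally)."""
    b, e, h = state
    p = min(ACTIONS[a] * N_BATTERY, b)       # cannot spend more than stored
    g = (h + 1) / N_CHANNEL                  # quantized channel gain
    reward = np.log2(1.0 + p * g)            # throughput proxy
    e_next = rng.integers(N_HARVEST)         # new energy arrival
    h_next = rng.integers(N_CHANNEL)         # new channel state
    b_next = int(min(b - p + e_next, N_BATTERY - 1))  # finite battery clips
    return (b_next, e_next, h_next), reward

# --- Q-learning with linear function approximation ---
alpha, gamma = 0.05, 0.9
state = (N_BATTERY // 2, 0, 0)
for _ in range(50_000):
    a = epsilon_greedy(state)
    next_state, r = step(state, a)
    td_target = r + gamma * max(q_value(next_state, a2)
                                for a2 in range(len(ACTIONS)))
    idx = feature_index(*state, a)
    theta[idx] += alpha * (td_target - theta[idx])  # update active feature only
    state = next_state
```

Note that with strictly one-hot indicators each update touches a single weight, so the sketch degenerates to tabular Q-learning; the appeal of binary feature functions in a setting like the paper's is that overlapping indicators can generalize across similar states without storing a full state-action table.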
