Multi-Agent Reinforcement Learning for Energy Harvesting Two-Hop Communications With a Partially Observable System State

Andrea Ortiz, Tobias Weber, Anja Klein

doi:10.1109/tgcn.2020.3026453

Abstract

We consider an energy harvesting (EH) transmitter communicating with a receiver through an EH relay. The harvested energy is used for data transmission, including the circuit energy consumption. As in practical scenarios, the system's state, which comprises the harvested energy, battery levels, data buffer levels, and channel gains, is only partially observable by the EH nodes. Moreover, the EH nodes have only outdated knowledge of the channel gains of their own transmit channels. Our goal is to find distributed transmission policies that maximize the throughput. A channel predictor based on a Kalman filter is implemented in each EH node to estimate the current channel gain of its own channel. Furthermore, to overcome the partial observability of the system's state, the EH nodes cooperate with each other to obtain information about their parameters during a signaling phase. We model the problem as a Markov game and propose a multi-agent reinforcement learning algorithm to find the transmission policies. We show the trade-off between the achievable throughput and the signaling required, and provide convergence guarantees for the proposed algorithm. Results show that even when the signaling overhead is taken into account, the proposed algorithm outperforms other approaches that do not consider cooperation.
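The channel predictor mentioned in the abstract can be illustrated with a minimal sketch of a scalar Kalman filter that fuses an outdated measurement and then predicts one step ahead. The abstract does not specify the channel model, so this sketch assumes a real-valued first-order autoregressive (AR(1)) channel with Gaussian noise. The class name `ScalarKalmanPredictor` and the parameters `a`, `q`, and `r` are hypothetical, chosen for illustration only, and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class ScalarKalmanPredictor:
    """One-step-ahead Kalman predictor for a scalar AR(1) state.

    Assumed model (not from the paper):
        h[t]   = a * h[t-1] + w,  w ~ N(0, q)
        z[t-1] = h[t-1] + v,      v ~ N(0, r)
    """
    a: float        # assumed AR(1) coefficient of the channel
    q: float        # assumed process-noise variance
    r: float        # assumed measurement-noise variance
    x: float = 0.0  # prior mean of the state being measured
    p: float = 1.0  # prior variance of that state

    def step(self, z_prev: float) -> float:
        """Fuse the outdated measurement z_prev, then predict the current value."""
        # Update: correct the estimate of the previous state with its measurement.
        k = self.p / (self.p + self.r)
        self.x = self.x + k * (z_prev - self.x)
        self.p = (1.0 - k) * self.p
        # Predict: propagate the corrected estimate one step to the present.
        self.x = self.a * self.x
        self.p = self.a * self.a * self.p + self.q
        return self.x
```

Under these assumptions, a node would call `step` once per time interval with the most recent (outdated) channel measurement and use the returned value as its estimate of the current channel gain. The variance `p` stored after each call indicates how uncertain that prediction is.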

Full Text