Optimization-Driven Hierarchical Learning Framework for Wireless Powered Backscatter-Aided Relay Communications

Shimin Gong, Dinh Thai Hoang, Bin Lyu, Yuze Zou, Dusit Niyato, Jing Xu

doi:10.1109/twc.2021.3103810

Abstract

In this paper, we employ multiple wireless-powered relays to assist information transmission from a multi-antenna access point to a single-antenna receiver. The wireless relays can operate in either the passive mode via backscatter communications or the active mode via RF communications, depending on their channel conditions and energy states. We aim to maximize the overall throughput by jointly optimizing the transmit beamforming and the relays’ radio modes and operating parameters. Because the problem is non-convex and combinatorial, we develop a novel optimization-driven hierarchical deep deterministic policy gradient (H-DDPG) approach to adapt the beamforming and relay strategies. First, the optimization-driven H-DDPG algorithm decomposes the problem: the binary relay mode selection is handled by an outer-loop deep Q-network (DQN) algorithm, and the continuous beamforming and relaying strategies are optimized by an inner-loop DDPG algorithm. Second, to improve the learning efficiency, we integrate model-based optimization into the inner-loop DDPG framework to provide a better-informed target estimation for DNN training. Simulation results reveal that these two special designs ensure a more stable learning performance and achieve a higher reward, by up to 20%, compared to the conventional model-free DDPG approach.
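The two-level structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the toy Q-table, and in particular the rule of taking the larger of the model-free and model-based value estimates as the critic target are assumptions made here for illustration. The abstract states only that model-based optimization provides a better-informed target estimation.

```python
import random


def select_relay_modes(q_values, epsilon, rng):
    """Outer loop (DQN-style): epsilon-greedy choice of binary relay modes.

    q_values is a hypothetical table mapping a tuple of 0/1 modes
    (0 = passive backscatter, 1 = active RF) to an estimated value.
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))      # explore a random mode vector
    return max(q_values, key=q_values.get)     # exploit the best-known one


def informed_target(reward, gamma, q_model_free, q_model_based):
    """Inner loop (DDPG-style): critic target for the continuous actions.

    ASSUMPTION: the next-state value is the larger of the target-network
    estimate and an estimate obtained from a model-based optimization.
    """
    return reward + gamma * max(q_model_free, q_model_based)


rng = random.Random(0)
q_values = {(0, 0): 0.2, (0, 1): 0.9, (1, 0): 0.5, (1, 1): 0.4}

# With epsilon = 0 the outer loop is purely greedy.
modes = select_relay_modes(q_values, epsilon=0.0, rng=rng)

# The model-based estimate (3.0) exceeds the model-free one (2.0),
# so it determines the target under the assumed rule.
target = informed_target(reward=1.0, gamma=0.9,
                         q_model_free=2.0, q_model_based=3.0)
```

In a full system the Q-table would be replaced by a neural network and the two value estimates would come from the target critic and from an optimization solver, respectively. The sketch shows only how the discrete and continuous decisions are separated.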

Full Text