Relay selection scheme based on deep reinforcement learning in wireless sensor networks

Dongmei Zhou, Baowan Yan, Cuiran Li, Aihuan Wang, Haixia Wei

doi:10.1016/j.phycom.2022.101799

Abstract

Cooperative communication improves the spectrum utilization of wireless communication systems without additional equipment and makes transmission more reliable, and it has become a research focus in wireless sensor networks (WSNs). Because relay selection is crucial to cooperative communication, this paper proposes two relay selection schemes for WSNs based on deep reinforcement learning (DRL): the Deep-Q-Network Based Relay Selection Scheme (DQN-RSS) and the Proximal Policy Optimization Based Relay Selection Scheme (PPO-RSS). Both are compared with the commonly used Q-learning relay selection scheme (Q-RSS) and with a random relay selection scheme. First, the cooperative communication process in WSNs is modeled as a Markov decision process, and the DRL algorithm is trained on the outage probability and the mutual information (MI), so that the best relay is selected adaptively from multiple candidate relays without knowledge of the instantaneous channel state information (CSI). Then, because Q-RSS converges slowly in a high-dimensional state space, the DRL algorithm is used to handle the high-dimensional state space and accelerate learning. The experimental results show that, under the same conditions, the random relay selection scheme always performs worst. Compared with Q-RSS, the two proposed schemes greatly reduce the number of iterations and converge faster, which reduces the computational complexity and overhead incurred by the source node in selecting the best relay strategy. In addition, the two proposed schemes achieve a lower outage probability, lower energy consumption, and a larger system capacity. In particular, PPO-RSS offers higher reliability and practicality.
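To make the learning loop behind the Q-RSS baseline concrete, the following is a minimal sketch and not the authors' implementation. It uses a stateless simplification (epsilon-greedy value learning with one value per relay) rather than the full Markov decision process. Everything specific is an assumption made for illustration: the function names, Rayleigh fading (exponentially distributed instantaneous SNR), a half-duplex two-hop mutual information of 0.5 * log2(1 + SNR), a reward equal to the MI with zero reward on outage, and all parameter values.

```python
import math
import random


def mutual_information(snr):
    """Assumed MI of a half-duplex two-hop link, in bits per channel use."""
    return 0.5 * math.log2(1.0 + snr)


def learn_relay_selection(mean_snrs, rate_threshold=1.0, episodes=5000,
                          alpha=0.1, epsilon=0.1, seed=0):
    """Epsilon-greedy value learning over candidate relays (hypothetical setup).

    The learner never observes the instantaneous CSI before choosing; it only
    sees the resulting reward, which is the MI, or 0 when the link is in outage.
    Returns the learned per-relay values and the empirical outage rate.
    """
    rng = random.Random(seed)
    q = [0.0] * len(mean_snrs)
    outages = 0
    for _ in range(episodes):
        # Explore a random relay with probability epsilon, else pick the best so far.
        if rng.random() < epsilon:
            relay = rng.randrange(len(mean_snrs))
        else:
            relay = max(range(len(q)), key=q.__getitem__)
        # Assumed Rayleigh fading: instantaneous SNR is exponential with the given mean.
        snr = rng.expovariate(1.0 / mean_snrs[relay])
        mi = mutual_information(snr)
        outage = mi < rate_threshold
        if outage:
            outages += 1
        reward = 0.0 if outage else mi
        # Incremental update of the chosen relay's value toward the observed reward.
        q[relay] += alpha * (reward - q[relay])
    return q, outages / episodes


if __name__ == "__main__":
    values, outage_rate = learn_relay_selection([0.5, 2.0, 50.0])
    print("learned values:", values)
    print("empirical outage rate:", outage_rate)
```

A table with one value per relay is enough here because the sketch has no state. In the setting the abstract describes, the state space is high-dimensional, and that is where a table scales poorly and a neural approximator such as DQN or PPO is used in its place.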

Full Text