Abstract
This paper investigates reinforcement learning for relay selection in delay-constrained buffer-aided networks. Buffer-aided relay selection significantly improves outage performance, but often at the price of higher latency. On the other hand, modern communication systems such as the Internet of Things often have strict latency requirements. It is thus necessary to find relay selection policies that achieve good throughput in the buffer-aided relay network while satisfying the delay constraint. With buffers employed at the relays and delay constraints imposed on the data transmission, finding the best relay selection becomes a complicated high-dimensional problem, which makes it hard for reinforcement learning to converge. In this paper, we propose a novel decision-assisted deep reinforcement learning approach to improve convergence. This is achieved by exploiting a-priori information from the buffer-aided relay system. The proposed approaches can achieve high throughput subject to delay constraints. Extensive simulation results are provided to verify the proposed algorithms.
Highlights
With the development of 5G communications, the Internet of Things (IoT) is becoming an increasingly important topic in the area of wireless networks [1]–[3].
Relay selection is known to be an efficient way to harvest diversity gains.
In [8], a relay selection scheme combined with feedback and adaptive forwarding in cooperative networks was studied.
Summary
With the development of 5G communications, the Internet of Things (IoT) is becoming an increasingly important topic in the area of wireless networks [1]–[3]. A novel decision-assisted deep reinforcement learning approach is proposed to improve convergence. This is achieved by exploiting a-priori information from the buffer-aided relay system. Two deep reinforcement learning algorithms, namely decision-assisted deep Q-learning and decision-assisted deep Sarsa, are proposed for buffer-aided relay selection subject to instantaneous delay constraints. This differs from existing buffer-aided schemes, which usually consider the average packet delay. For moderate delay constraints, the decision-assisted deep Sarsa can achieve the highest possible throughput in the two-hop relay network, making it a very attractive scheme in practice.
The rest of the paper is organized as follows: Section II describes the system model; Section III formulates the problem of optimum relay selection in the delay-constrained buffer-aided relay network; Section IV defines the elements of reinforcement learning for relay selection; Section V describes deep reinforcement learning for relay selection; Section VI proposes the decision-assisted deep reinforcement learning, which exploits the a-priori information in the relay system; Section VII verifies the proposed algorithms with simulation; Section VIII concludes the paper.
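To make the two ideas in this summary concrete, the following is a minimal sketch and not the paper's algorithm. It is tabular rather than deep, it ignores channel states and the delay constraint, and every name in it (`feasible_actions`, `select_action`, `q_learning_target`, `sarsa_target`, the `("rx", k)` / `("tx", k)` action encoding) is hypothetical. It assumes that one plausible form of a-priori information is buffer occupancy: a relay with a full buffer cannot receive, and a relay with an empty buffer cannot transmit. Restricting the learner to the remaining actions shrinks the search space, which is one way such information could help convergence. The sketch also shows the standard difference between the Q-learning and Sarsa update targets.

```python
import random


def feasible_actions(buffers, capacity):
    """Return the actions allowed by the assumed a-priori buffer rule.

    buffers[k] is the number of packets stored at relay k.
    ("rx", k): source sends to relay k (needs free space at relay k).
    ("tx", k): relay k sends to the destination (needs a stored packet).
    """
    actions = []
    for k, occupancy in enumerate(buffers):
        if occupancy < capacity:
            actions.append(("rx", k))
        if occupancy > 0:
            actions.append(("tx", k))
    return actions


def select_action(q, state, actions, epsilon, rng=random):
    """Epsilon-greedy choice restricted to the feasible actions."""
    if rng.random() < epsilon:
        return rng.choice(actions)
    return max(actions, key=lambda a: q.get((state, a), 0.0))


def q_learning_target(q, reward, next_state, next_actions, gamma):
    """Off-policy target: bootstrap from the best feasible next action."""
    best = max((q.get((next_state, a), 0.0) for a in next_actions), default=0.0)
    return reward + gamma * best


def sarsa_target(q, reward, next_state, next_action, gamma):
    """On-policy target: bootstrap from the next action actually chosen."""
    return reward + gamma * q.get((next_state, next_action), 0.0)


def update(q, state, action, target, alpha):
    """Move the stored value for (state, action) toward the target."""
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (target - old)
```

In a deep variant, the table `q` would be replaced by a neural network that maps a state to one value per action, and the same feasibility mask would be applied to the network outputs before taking the maximum or sampling.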