Abstract

This paper considers a buffer-aided relaying system with multiple source-destination pairs and a relay node (RN) with energy harvesting (EH) capability. The RN harvests energy from the ambient environment and uses the harvested energy to forward the sources' information packets to the corresponding destinations. It is assumed that information on the EH and channel gain processes is unavailable. A model-free deep reinforcement learning (DRL) method, specifically deep Q-learning, is therefore applied to learn an optimal link selection policy directly from historical experience so as to maximize the system utility. In addition, by exploiting the structural features of the considered system, a pretraining scheme is proposed to accelerate the training convergence of the deep Q-network. Experimental results show that the proposed pretraining method significantly reduces the required training time. Moreover, the transmission policy obtained by deep Q-learning is compared with several conventional transmission schemes and is shown to achieve better performance.

Highlights

  • Cooperative relaying communication, in which a relay node helps to forward the source’s information to destination, is capable of attaining significant throughput and reliability improvements [1]

  • To solve the optimization problem with deep Q-learning, we reformulate it in terms of states, actions, and rewards

  • We propose DRL-based access control with pretraining for a buffer-aided relaying system with energy harvesting
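The state-action-reward reformulation mentioned in the highlights can be illustrated with a toy sketch. For brevity, this uses tabular Q-learning in place of a deep Q-network, and all quantities (buffer capacity, battery capacity, energy-arrival model, unit reward per delivered packet) are illustrative assumptions, not values from the paper. The state is the relay's (buffer occupancy, battery level) pair, the action selects which link is active, and the reward counts delivered packets:

```python
import random

# Hypothetical toy model of the buffer-aided EH relay (all sizes are
# illustrative assumptions, not values from the paper).
BUFFER_CAP = 3      # max packets the relay buffer can hold
BATTERY_CAP = 3     # max energy units the relay battery can hold
ACTIONS = [0, 1]    # 0: source->relay link (receive), 1: relay->dest link (transmit)

def step(state, action):
    """One transition: returns (next_state, reward)."""
    buf, bat = state
    reward = 0.0
    if action == 0 and buf < BUFFER_CAP:        # receive a packet into the buffer
        buf += 1
    elif action == 1 and buf > 0 and bat > 0:   # spend one energy unit to forward
        buf -= 1
        bat -= 1
        reward = 1.0                            # one packet delivered
    bat = min(BATTERY_CAP, bat + random.randint(0, 1))  # random energy arrival
    return (buf, bat), reward

# Tabular Q-learning; a deep Q-network would replace this lookup table.
Q = {(b, e): [0.0, 0.0]
     for b in range(BUFFER_CAP + 1) for e in range(BATTERY_CAP + 1)}
alpha, gamma, eps = 0.1, 0.9, 0.1

random.seed(0)
state = (0, 0)
for _ in range(20000):
    # Epsilon-greedy link selection.
    if random.random() < eps:
        a = random.choice(ACTIONS)
    else:
        a = max(ACTIONS, key=lambda x: Q[state][x])
    nxt, r = step(state, a)
    # Standard Q-learning update toward the bootstrapped target.
    Q[state][a] += alpha * (r + gamma * max(Q[nxt]) - Q[state][a])
    state = nxt

# With a full buffer and a charged battery, transmitting should dominate.
best = max(ACTIONS, key=lambda a: Q[(BUFFER_CAP, BATTERY_CAP)][a])
print(best)
```

The learned policy naturally prefers the relay-to-destination link whenever the buffer and battery allow it, which is the kind of structural regularity the paper's pretraining scheme exploits.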


Summary

INTRODUCTION

Cooperative relaying communication, in which a relay node helps to forward the source's information to the destination, is capable of attaining significant throughput and reliability improvements [1]. A buffer-aided relaying system was studied in [9], where both the source node and the RN harvest energy from the environment. In [10], the delay constraint for a buffer-aided relaying system with energy harvesting was considered and the system throughput was investigated. In these works, it was assumed that the EH and channel gain processes are either known ahead of time (non-causal case) or that their distributions are available. In practical systems, this information on the EH and channel gain processes is usually unknown. To overcome this challenge, model-free techniques have been discussed for designing efficient transmission strategies [11]–[14]. Experimental results show that, compared with the traditional methods, our model achieves better performance.

SYSTEM MODEL
ACTION SPACE
TRANSITION FUNCTION
REWARD FUNCTION
OPTIMIZATION PROBLEM
FINE TUNING WITHIN AUTHENTIC ENVIRONMENTS
EXTENSION
EXPERIMENTS
CANONICAL VARIATE ANALYSIS ON EH MODEL
CONCLUSION AND FUTURE WORK
