Abstract

The explosive growth of machine-type communications (MTC) devices poses critical challenges to existing cellular networks, making support for massive numbers of MTC devices with limited resources an urgent problem. Bursty traffic is an important characteristic of MTC devices; it makes it difficult for agents to accumulate useful experience and hinders model convergence. However, most existing reinforcement learning-based studies assume that devices always have saturated traffic. To address this gap, we propose two distributed Q-learning-aided uplink grant-free non-orthogonal multiple access (NOMA) schemes, the all-devices distributed Q-learning (ADDQ) scheme and the portion-devices distributed Q-learning (PDDQ) scheme, to maximize the number of accessible devices while explicitly accounting for the bursty traffic of massive MTC devices. To reduce the dimension of the scheduling space and mitigate the impact of bursty traffic, our schemes group both devices and transmission resources and adopt an intermittent learning mode. Extensive numerical results demonstrate the advantages of the proposed schemes from multiple perspectives.
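To make the general idea concrete, the following is a minimal sketch of stateless distributed Q-learning for grant-free resource selection under bursty traffic. It is not the ADDQ or PDDQ scheme, whose details the abstract does not give. The function name `simulate`, the activation probability `p_active`, the per-resource `capacity`, the +1/-1 reward, and the reading of "intermittent learning" as "a device updates its Q-values only in slots where it has data" are all assumptions made for illustration.

```python
import random


def simulate(num_devices=6, num_resources=3, capacity=2, p_active=0.5,
             episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Toy distributed Q-learning over shared resources (illustrative only)."""
    rng = random.Random(seed)
    # One stateless Q-table per device: one value per candidate resource.
    q = [[0.0] * num_resources for _ in range(num_devices)]
    successes = 0
    attempts = 0
    for _ in range(episodes):
        # Bursty traffic (assumed model): each device independently has
        # a packet in this slot with probability p_active.
        active = [d for d in range(num_devices) if rng.random() < p_active]
        choice = {}
        for d in active:
            if rng.random() < epsilon:
                # Explore: pick a resource uniformly at random.
                choice[d] = rng.randrange(num_resources)
            else:
                # Exploit: pick the resource with the highest Q-value.
                choice[d] = max(range(num_resources), key=lambda r: q[d][r])
        # Count how many active devices landed on each resource.
        load = [0] * num_resources
        for r in choice.values():
            load[r] += 1
        # Only devices that transmitted update their Q-values
        # (our assumed reading of an intermittent learning mode).
        for d, r in choice.items():
            # Assumed reward: success if the resource is not overloaded.
            reward = 1.0 if load[r] <= capacity else -1.0
            q[d][r] += alpha * (reward - q[d][r])
            attempts += 1
            successes += int(reward > 0)
    return q, successes, attempts


if __name__ == "__main__":
    q_tables, ok, total = simulate()
    print(f"successful transmissions: {ok} of {total}")
```

Here `capacity` stands in for the number of devices a NOMA receiver could separate on one resource. Because idle devices never update, each device learns from fewer samples as `p_active` falls, which is the convergence difficulty the abstract attributes to bursty traffic.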
