Abstract

As a candidate radio access technique for 5G, Non-Orthogonal Multiple Access (NOMA) has become an important research topic. A Radio Frequency (RF) jamming attack can reduce the communication efficiency of a NOMA system, and a jammer equipped with a Reinforcement Learning (RL) algorithm is even more destructive. On the other hand, the base station (BS) can also employ RL to counter the jamming attack, so the whole system evolves into a multi-agent RL system. The interaction between agents produces a highly dynamic environment, and the equilibrium state of the system cannot be predicted intuitively. In the past few years, building on Evolutionary Game Theory (EGT), a number of researchers have developed useful tools to study multi-agent RL systems in detail. These EGT tools give insight into the equilibrium of the system and make it possible to compare the performance of different RL algorithms. In this paper, we investigate the anti-jamming problem in a NOMA system where both the base station and the jammer employ RL algorithms. We formulate the two-player game and demonstrate the existence and uniqueness of its equilibrium. Three RL algorithms and their learning dynamics are introduced: Q-learning, Lenient Frequency-Adjusted Q-learning, and Regret Minimization. In experiments, the simulation results are consistent with the theoretical results given by EGT, and Regret Minimization outperforms the other two algorithms in terms of average reward and convergence rate.
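For orientation, the sketch below illustrates the kind of EGT analysis the abstract refers to: the standard replicator-with-mutation dynamics of Boltzmann Q-learning (due to Tuyls et al.), integrated for a two-player BS-vs-jammer matrix game. The payoff matrix, the zero-sum assumption, and the learning parameters `alpha` and `tau` are illustrative assumptions, not values from this paper.

```python
import numpy as np

# Hypothetical 2x2 anti-jamming game (illustrative, not from the paper):
# row player = base station (BS), column player = jammer.
# BS actions: {stay on channel, hop channel}; jammer actions: {jam, idle}.
# Entries are BS payoffs; the game is assumed zero-sum here.
A = np.array([[-1.0, 1.0],   # BS stays: heavily penalized if jammed
              [ 0.5, 0.2]])  # BS hops: safer, but hopping carries a cost
B = -A                       # jammer payoffs (zero-sum assumption)

def q_learning_dynamics(x, y, A, alpha=0.1, tau=0.5):
    """EGT model of Boltzmann Q-learning (Tuyls et al.):
    dx_i = x_i*(alpha/tau)*((A y)_i - x.A.y)            # selection (replicator) term
         + x_i*alpha*(-ln x_i + sum_k x_k ln x_k)       # mutation (exploration) term
    """
    fx = A @ y
    return (x * (alpha / tau) * (fx - x @ fx)
            + x * alpha * (-np.log(x) + x @ np.log(x)))

# Euler-integrate the coupled BS/jammer dynamics from a mixed start.
x = np.array([0.6, 0.4])  # BS strategy distribution
y = np.array([0.5, 0.5])  # jammer strategy distribution
dt = 0.01
for _ in range(20000):
    dx = q_learning_dynamics(x, y, A)
    dy = q_learning_dynamics(y, x, B.T)  # jammer's payoffs per action: (B^T x)_j
    x = np.clip(x + dt * dx, 1e-9, None); x /= x.sum()
    y = np.clip(y + dt * dy, 1e-9, None); y /= y.sum()

print("BS strategy at rest point:", x)
print("Jammer strategy at rest point:", y)
```

This game has no pure-strategy saddle point, so the dynamics settle at an interior rest point near the mixed equilibrium, slightly perturbed by the exploration term, which is the kind of equilibrium behavior the EGT analysis in the paper makes predictable.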
