Abstract

We investigate the problem of dynamic spectrum anti-jamming access against an intelligent jammer using game theory and opponent modeling. Previous work has formulated the interaction between the user and the intelligent jammer as an adversarial game and aimed to find the Nash Equilibrium (NE). However, sticking to the NE leads to overcautious behavior and cannot achieve the best performance when the jammer is sub-optimal. This letter therefore seeks to exploit the adaptive jammer and find the Best Response (BR) rather than the NE. We propose a minimax deep Q-network (DQN) to approximate the anti-jamming utility while applying imitation learning to reason about the jammer's policy. Based on this utility and the imitated jamming policy, the user can find a policy beyond equilibrium solutions and enhance anti-jamming performance. Numerical results demonstrate that our scheme achieves a 30% improvement in successful access rate over NE-based and single-user DRL schemes.
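The contrast the abstract draws between the minimax (NE) solution and the best response against a modeled jammer can be illustrated with a small sketch. This is not the paper's method (which uses a minimax DQN and learned imitation); it is a minimal tabular stand-in where the utility is a toy Q-table, the "imitation" of the jammer's policy is an empirical frequency count of observed jamming actions, and all channel counts and probabilities are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting (assumed): 4 channels; the user picks a channel to access,
# the jammer jams one. Q[a, o] stands in for the anti-jamming utility of
# user action a against jammer action o (state dimension omitted).
n_channels = 4
Q = rng.uniform(0.0, 1.0, size=(n_channels, n_channels))
np.fill_diagonal(Q, 0.0)  # accessing the jammed channel yields no utility

# NE-style minimax play: assume the worst-case jammer response.
minimax_action = int(np.argmax(Q.min(axis=1)))

# Opponent modeling: estimate the jammer's policy from observed actions
# (frequency counts here, standing in for learned imitation).
observed = rng.choice(n_channels, size=100, p=[0.7, 0.1, 0.1, 0.1])
pi_jam = np.bincount(observed, minlength=n_channels) / len(observed)

# Best response: maximize expected utility under the estimated policy,
# rather than guarding against the worst case.
br_action = int(np.argmax(Q @ pi_jam))

print("minimax action:", minimax_action)
print("best-response action:", br_action)
```

Against a sub-optimal (here, heavily biased) jammer, the best response by construction attains expected utility at least as high as the minimax action, which is the gap the letter exploits.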
