Abstract

In modern electronic warfare, developing intelligent and adaptive radar anti-jamming methods has become increasingly important, since jammers can now launch increasingly complex and unpredictable attacks. Moreover, in practice the jamming strategy is usually unknown to the radar. To overcome the limitations imposed by this lack of information about the jammer, this paper applies reinforcement learning (RL) to radar anti-jamming through adaptation of the frequency hopping interval. In reinforcement learning, the sequential decision problem is described as a Markov Decision Process (MDP); accordingly, a detailed radar anti-jamming MDP model is formulated to capture the sequential anti-jamming decision-making process. To balance integration efficiency against the probability of interception, a flexible, adjustable tradeoff is devised by defining the MDP reward function as the weighted sum of an integration efficiency factor and a probability of interception factor. Two properties of the MDP value function are proved and then used to derive the optimal frequency hopping time interval for different pulse widths under the RL framework. Simulation results show that the proposed radar anti-jamming strategy adapts well to the jamming environment and can flexibly control its performance by adjusting the weights on integration efficiency and probability of interception.
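The abstract describes the reward as a weighted sum of an integration-efficiency factor and a probability-of-interception factor. The sketch below illustrates one way such a reward could be written; the factor definitions, normalization, and weight values (eff_factor, poi_factor, w_eff, w_poi) are assumptions for illustration only and are not taken from the paper.

```python
# Minimal sketch (assumed forms, not the paper's exact definitions): the MDP
# reward is a weighted sum of an integration-efficiency factor and a
# probability-of-interception factor.

def anti_jamming_reward(eff_factor: float, poi_factor: float,
                        w_eff: float = 0.7, w_poi: float = 0.3) -> float:
    """Weighted-sum reward balancing integration efficiency against
    probability of interception.

    Assumes both factors are normalized to [0, 1], with poi_factor defined so
    that a lower interception probability gives a larger factor value
    (e.g., poi_factor = 1 - p_intercept).
    """
    return w_eff * eff_factor + w_poi * poi_factor


if __name__ == "__main__":
    # Hypothetical example: a longer hopping interval may raise integration
    # efficiency but also raise the chance of interception on one frequency.
    p_intercept = 0.4  # assumed interception probability for this interval
    r = anti_jamming_reward(eff_factor=0.85, poi_factor=1.0 - p_intercept)
    print(f"reward = {r:.3f}")
```

Adjusting w_eff and w_poi is what the abstract refers to as the flexible tradeoff: larger w_eff favors coherent integration, while larger w_poi favors a lower probability of interception.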
