Abstract

Reinforcement learning (RL) learns by interacting with an environment over time, which makes it useful for cognitive jamming research, particularly in electronic warfare scenarios where the communication parameters and the jamming effect are unknown to the jammer. In this paper, we propose a jamming strategy algorithm that combines orthogonal matching pursuit (OMP) with a multi-armed bandit (MAB). We construct a dictionary in which each atom represents a symbol error rate (SER) curve that can be computed from a known noise distribution and deterministic parameters. Through reconnaissance, the jammer counts acknowledgement/negative-acknowledgement (ACK/NACK) frames to estimate the SER; these estimates are treated as samples drawn from the real SER curve by the MAB. Given the sampled sequence and the constructed dictionary, the OMP algorithm searches for the matching atoms and their corresponding coefficients. From the search results, the jammer constructs an SER curve that approximates the real SER curve. The experimental results demonstrate that the proposed algorithm can learn an optimal jamming strategy in three interactions, converging substantially faster than the state of the art.
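The dictionary-plus-OMP step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the atom model (BPSK in Gaussian noise, with jamming treated as extra noise), the parameter grids, the three probe points, and the helper names `ser_curve`, `solve`, and `omp` are all assumptions chosen for illustration, and the samples are noiseless.

```python
import math

def ser_curve(signal_power, jam_powers, noise=0.1):
    """Hypothetical atom: BPSK symbol error rate versus jamming power,
    treating the jamming signal as additional Gaussian noise (an assumption)."""
    return [0.5 * math.erfc(math.sqrt(signal_power / (noise + j)))
            for j in jam_powers]

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def solve(A, b):
    """Solve a small square system A x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(M[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (M[r][n] - tail) / M[r][r]
    return x

def omp(atoms, y, n_nonzero=1, tol=1e-12):
    """Orthogonal matching pursuit. `atoms` is a list of dictionary columns,
    `y` the observed samples. Returns (selected indices, coefficients)."""
    residual = y[:]
    support, coefs = [], []
    for _ in range(n_nonzero):
        # Select the unused atom with the largest normalised correlation
        # to the current residual.
        scores = [-1.0 if i in support
                  else abs(dot(a, residual)) / math.sqrt(dot(a, a))
                  for i, a in enumerate(atoms)]
        support.append(max(range(len(atoms)), key=scores.__getitem__))
        # Least-squares fit of y on the selected atoms (normal equations).
        gram = [[dot(atoms[i], atoms[j]) for j in support] for i in support]
        rhs = [dot(atoms[i], y) for i in support]
        coefs = solve(gram, rhs)
        residual = [yi - sum(c * atoms[i][k] for c, i in zip(coefs, support))
                    for k, yi in enumerate(y)]
        if dot(residual, residual) < tol:
            break
    return support, coefs

# Illustrative grids (assumed): 40 jamming powers, 20 candidate signal powers.
jam_grid = [0.25 * k for k in range(1, 41)]
signal_powers = [0.5 * k for k in range(1, 21)]
dictionary = [ser_curve(p, jam_grid) for p in signal_powers]

true_index = 7                       # the "real" curve, unknown to the jammer
true_curve = dictionary[true_index]
probe = [4, 19, 34]                  # three probed jamming powers
y = [true_curve[k] for k in probe]   # noiseless SER samples

# Run OMP on the dictionary restricted to the probed rows, then rebuild
# the full curve from the selected atom and its coefficient.
sub_atoms = [[atom[k] for k in probe] for atom in dictionary]
support, coefs = omp(sub_atoms, y, n_nonzero=1)
estimate = [coefs[0] * v for v in dictionary[support[0]]]
```

With noiseless samples the selected atom matches the generating curve. With SER values estimated from a finite number of ACK/NACK frames, the samples are noisy and a neighbouring atom may be selected instead, so the number of frames counted per probe matters.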

Highlights

  • Wireless communication has extensive utilization in civilian and military domains with the advantage of convenience [1,2,3]

  • We investigate the ability of an agent to learn an efficient jamming strategy using sparse representation and reinforcement learning (RL)

  • Although RL needs no prior information and is convenient to implement, it converges slowly, which limits its application


Summary

Introduction

Wireless communication is used extensively in civilian and military domains because of its convenience [1,2,3]. Jamming methods can be grouped into three categories, including the following: (1) Reconnaissance, evaluation, and jamming: the jammer collects the required information, such as the modulation scheme, transmission power, and communication protocols, and takes targeted actions, such as a denial-of-service (DoS) attack, eavesdropping attack, correlation attack, or hybrid attack. (2) Game theory: when the jammer and the communicators are each aware of the other's existence, each side takes jamming or anti-jamming actions to defeat the other. The major disadvantage of these methods is that they assume the jammer has accurate information about environmental factors and receiver actions. Game theory models a dynamic process between the jammer and the transmitter-receiver pairs; it can reach a Nash equilibrium between the two sides [11], but the jammer still needs an efficient jamming strategy, which is the aim of this paper. Although RL needs no prior information and is convenient to implement, it converges slowly, which limits its application.
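To make the convergence point concrete, the following is a minimal sketch of a bandit that estimates the SER at each jamming power purely from overheard ACK/NACK feedback. It uses UCB1, which is a swapped-in textbook algorithm and not necessarily the one used in the paper; the channel model in `simulate_nack`, the three power levels, and the round count are hypothetical. Each estimate is an average of binary outcomes, so its standard error shrinks only as the inverse square root of the number of frames observed on that arm.

```python
import math
import random

def simulate_nack(jam_power, rng, signal_power=4.0, noise=0.1):
    """Hypothetical channel: returns True when the jammer overhears a NACK.
    Errors follow a BPSK-in-Gaussian-noise model, which is an assumption."""
    ser = 0.5 * math.erfc(math.sqrt(signal_power / (noise + jam_power)))
    return rng.random() < ser

def ucb1(jam_powers, rounds, rng):
    """UCB1 over jamming-power arms. The reward is 1 for a NACK and 0 for
    an ACK, so each arm's running mean is an empirical SER estimate."""
    counts = [0] * len(jam_powers)
    means = [0.0] * len(jam_powers)
    for t in range(1, rounds + 1):
        if t <= len(jam_powers):
            arm = t - 1  # play every arm once before using the index
        else:
            arm = max(
                range(len(jam_powers)),
                key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]),
            )
        reward = 1.0 if simulate_nack(jam_powers[arm], rng) else 0.0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]  # incremental mean
    return counts, means

rng = random.Random(0)            # fixed seed for repeatability
jam_powers = [1.0, 5.0, 9.0]      # illustrative power levels
counts, ser_estimates = ucb1(jam_powers, rounds=300, rng=rng)
```

Here `ser_estimates[i]` is the fraction of NACKs observed at `jam_powers[i]`, and `counts[i]` is how many frames contributed to it.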

