Abstract

This paper considers an adversarial scenario between a legitimate eavesdropper and a suspicious communication pair, in which all three nodes are equipped with multiple antennas. The eavesdropper, operating in full-duplex mode, aims to wiretap the suspicious pair through proactive jamming. The suspicious transmitter, which can send artificial noise (AN) to disturb the wiretap channel, aims to guarantee secrecy. Specifically, the eavesdropper adjusts its jamming power to enhance the wiretap rate, while the suspicious transmitter jointly adapts its transmit power and noise power to counter the eavesdropping. Because each side has only partial observations and the two sides interact in complicated ways under unknown system dynamics, we model the problem as an imperfect-information stochastic game. To approach the Nash equilibrium of this eavesdropping game, we develop a multi-agent reinforcement learning (MARL) algorithm, termed neural fictitious self-play with soft actor-critic (NFSP-SAC), which combines fictitious self-play (FSP) with the deep reinforcement learning algorithm SAC. Introducing SAC enables FSP to handle problems with continuous, high-dimensional observation and action spaces. Simulation results demonstrate that the power allocation policies learned by our method empirically converge to a Nash equilibrium, whereas the reinforcement learning algorithms used for comparison suffer from severe fluctuations during learning.
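To give some intuition for the fictitious-play idea that NFSP-SAC builds on, the following is a minimal sketch of classical tabular fictitious play on a small zero-sum matrix game. It is not the paper's algorithm: the matching-pennies payoff matrix, the function name `fictitious_play`, and the iteration count are illustrative assumptions, and the discrete actions here stand in for the continuous power levels that the paper handles with SAC. Each player repeatedly best-responds to the opponent's empirical average strategy, and in two-player zero-sum games these averages are known to approach a Nash equilibrium.

```python
import numpy as np


def fictitious_play(payoff, iterations=5000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Returns the empirical (time-averaged) strategies of both players.
    """
    payoff = np.asarray(payoff, dtype=float)
    n_rows, n_cols = payoff.shape

    # Action counts; each player starts with one arbitrary play of action 0.
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    row_counts[0] += 1
    col_counts[0] += 1

    for _ in range(iterations):
        # Best response to the opponent's empirical action frequencies.
        # Using raw counts instead of frequencies does not change the argmax/argmin.
        row_best = np.argmax(payoff @ col_counts)   # row player maximises
        col_best = np.argmin(row_counts @ payoff)   # column player minimises
        row_counts[row_best] += 1
        col_counts[col_best] += 1

    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


# Illustrative game (an assumption, not from the paper): matching pennies,
# whose unique Nash equilibrium is the uniform mixed strategy for both players.
matching_pennies = [[1, -1], [-1, 1]]
row_avg, col_avg = fictitious_play(matching_pennies)
print("row player average strategy:", row_avg)
print("column player average strategy:", col_avg)
```

The individual best responses keep cycling between actions, while the averaged strategies settle near the uniform mix. Averaging the policy over time is the property that fictitious self-play carries over to learned policies.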
