Abstract

In this paper, we consider an intelligent reflecting surface (IRS)-assisted cognitive radio system and maximize the secondary user (SU) rate by jointly optimizing the transmit power of the secondary transmitter (ST) and the IRS’s reflect beamforming. The optimization is subject to three constraints: the minimum required signal-to-interference-plus-noise ratio (SINR) at the primary receiver, the ST’s maximum transmit power, and the unit modulus of each element of the IRS reflect beamforming vector. This joint optimization problem can be solved suboptimally with non-convex optimization techniques, which, however, usually require complicated mathematical transformations and are computationally intensive. To address this challenge, we propose an algorithm based on the deep deterministic policy gradient (DDPG) method. To achieve higher learning efficiency and lower reward variance, we propose a second algorithm based on the soft actor-critic (SAC) method. Both algorithms use a reward impact adjustment approach to improve their learning efficiency and stability. Simulation results show that the two proposed algorithms achieve an SU rate comparable to that of the existing non-convex optimization-based benchmark algorithm with a much shorter running time, and that the SAC-based algorithm learns faster and achieves a higher average reward with lower variance than the DDPG-based algorithm.
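To make the objective and the three constraints concrete, the following is a minimal Python sketch of how one candidate solution (an ST transmit power and a set of IRS phase shifts) could be evaluated. It is an illustration under stated assumptions, not the system model of the paper: the abstract does not specify antenna counts or the channel model, so the sketch assumes single-antenna nodes, treats interference as noise at both receivers, and uses hypothetical names (`effective_channel`, `evaluate`, the channel dictionary keys, and all default parameter values).

```python
import cmath
import math


def effective_channel(direct, to_irs, from_irs, phases):
    """Direct link plus the IRS-reflected link (assumed single-antenna nodes).

    Each reflection coefficient is exp(j * theta), so it has unit modulus
    by construction, which satisfies the unit-modulus constraint.
    """
    reflected = sum(
        r * cmath.exp(1j * theta) * g
        for g, r, theta in zip(to_irs, from_irs, phases)
    )
    return direct + reflected


def evaluate(p_st, phases, ch, p_pt=1.0, noise=0.01, p_max=1.0, sinr_min=1.0):
    """Return (SU rate in bit/s/Hz, SINR at the primary receiver, feasible).

    ch is a hypothetical dict of channels: direct links h_ss, h_sp, h_ps,
    h_pp (first letter: transmitter, second: receiver; s = secondary,
    p = primary), transmitter-to-IRS links g_s, g_p, and IRS-to-receiver
    links r_s, r_p.
    """
    h_ss = effective_channel(ch["h_ss"], ch["g_s"], ch["r_s"], phases)
    h_sp = effective_channel(ch["h_sp"], ch["g_s"], ch["r_p"], phases)
    h_ps = effective_channel(ch["h_ps"], ch["g_p"], ch["r_s"], phases)
    h_pp = effective_channel(ch["h_pp"], ch["g_p"], ch["r_p"], phases)

    # Assumption: each receiver treats the other transmitter as noise.
    su_sinr = p_st * abs(h_ss) ** 2 / (p_pt * abs(h_ps) ** 2 + noise)
    pr_sinr = p_pt * abs(h_pp) ** 2 / (p_st * abs(h_sp) ** 2 + noise)
    su_rate = math.log2(1.0 + su_sinr)

    # Power budget at the ST and minimum SINR at the primary receiver.
    feasible = 0.0 <= p_st <= p_max and pr_sinr >= sinr_min
    return su_rate, pr_sinr, feasible


# Toy channels (made-up values) with a two-element IRS.
toy = {
    "h_ss": 1.0, "h_sp": 0.1, "h_ps": 0.1, "h_pp": 1.0,
    "g_s": [1.0, 1.0], "g_p": [0.0, 0.0],
    "r_s": [1.0, 1.0], "r_p": [0.0, 0.0],
}
aligned = evaluate(1.0, [0.0, 0.0], toy)
opposed = evaluate(1.0, [math.pi, math.pi], toy)
print(aligned, opposed)
```

In this toy setting, phase shifts that align the reflected path with the direct ST-to-SU link give a higher SU rate than phase shifts that oppose it. A learning-based or optimization-based solver would search over `p_st` and `phases` using an evaluation of this kind.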
