Abstract

This article develops a data-driven resource allocation method for target tracking in a multiple radar system (MRS), based on deep reinforcement learning (DRL). The goal is to meet a given tracking accuracy requirement with minimum long-short term power consumption through radar selection and MRS power allocation. Because of the tracking accuracy requirement, the problem can be modeled as a constrained Markov decision process (MDP). To solve it, a constrained DRL method built on deep deterministic policy gradient (DDPG) is introduced. Specifically, Lagrangian relaxation converts the constrained MDP into an unconstrained one, which brings the tracking accuracy requirement into the derivation of the policy gradient of the DDPG actor network, so that the resource allocation policy learned by DRL yields tracking performance that meets the given requirement. In addition, to account for the limited radar and power resources of the MRS, three output layers of the actor network are redesigned to determine the radar selection and power allocation actions. As a result, the assignment and transmit power of each radar in the MRS can be determined in real time at each tracking interval. Simulation results demonstrate the effectiveness of the proposed method.
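
The Lagrangian relaxation step mentioned in the abstract can be illustrated on a toy scalar problem. The sketch below is not the article's DDPG-based method. It assumes a single radar whose tracking error is the hypothetical model 1/p for transmit power p, and it minimizes p subject to 1/p <= eps by alternating gradient descent on p with gradient ascent on the multiplier lam. The function name, the error model, and all step sizes are illustrative assumptions.

```python
def solve_toy_constrained_power(eps=0.5, steps=20000, lr_p=0.05, lr_lam=0.05):
    """Primal-dual iteration on L(p, lam) = p + lam * (1/p - eps).

    Toy stand-in for the constrained problem: minimize power p subject to
    tracking error 1/p <= eps. The 1/p error model is an assumption made
    for illustration only.
    """
    p, lam = 1.0, 0.0  # initial power and Lagrange multiplier
    for _ in range(steps):
        err = 1.0 / p                  # hypothetical tracking-error model
        grad_p = 1.0 - lam / (p * p)   # dL/dp
        grad_lam = err - eps           # dL/dlam, i.e. the constraint violation
        p = max(p - lr_p * grad_p, 1e-3)         # descent step, keep p positive
        lam = max(lam + lr_lam * grad_lam, 0.0)  # ascent step, keep lam >= 0
    return p, lam


p_opt, lam_opt = solve_toy_constrained_power()
print(round(p_opt, 3), round(lam_opt, 3))
```

For this toy problem the constrained optimum is p = 1/eps with multiplier lam = 1/eps^2, so the iteration should settle where the accuracy constraint is just met. In the article's setting the same idea enters through the policy gradient of the actor network rather than through a scalar update.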
