In this work, we consider an Unmanned Aerial Vehicle (UAV)-aided covert transmission network that uses the uplink transmissions of Communication Nodes (CNs) as a cover for covert transmission to a Primary Communication Node (PCN). Specifically, all nodes transmit to the UAV using uplink Non-Orthogonal Multiple Access (NOMA), while the UAV performs covert transmission to the PCN on the same frequency. To minimize the average age of covert information, we formulate a joint UAV trajectory and power allocation design problem subject to multi-dimensional constraints, including the covertness demand, the communication quality requirement, the maximum flying speed, and the maximum available resources. To address this problem, we embed Signomial Programming (SP) into Deep Reinforcement Learning (DRL) and propose a DRL framework capable of handling constrained Markov decision processes, named SP-embedded Soft Actor-Critic (SSAC). With SSAC, we jointly optimize the UAV trajectory and power allocation. Our simulations show the optimized UAV trajectory and verify the superiority of SSAC over various existing baseline schemes. The results suggest that keeping the UAV at appropriate distances from both the PCN and the CNs effectively enhances covert communication performance by reducing the detection probability of the CNs.