This paper studies a fixed-wing unmanned aerial vehicle (UAV) assisted mobile relaying network (FUAVMRN), in which a fixed-wing UAV operates as an out-of-band full-duplex relay to serve a ground source-destination pair. It is shown that, for a FUAVMRN, a straight path is unsuitable when a large amount of data needs to be delivered, while a circular path may lead to low throughput when the ground source and destination are far apart. A running-track path (RTP) design problem is therefore investigated for the FUAVMRN with the goal of energy minimization. By dividing an RTP into two straight and two semicircular segments, the total energy consumption of the UAV and the total amount of data transferred from the ground source to the ground destination via the UAV relay are calculated. Within the framework of Deep Reinforcement Learning, and taking the UAV's roll-angle limit into consideration, the RTP design problem is formulated as a Markov Decision Process, with the state and action spaces and the policy and reward functions specified. To obtain the control policy for the UAV relay, Deep Deterministic Policy Gradient (DDPG) is used to solve the path design problem, yielding a DDPG-based algorithm for the RTP design. Computer simulations show that the DDPG-based algorithm consistently converges after about 500 training iterations. Compared with the circular and straight paths, the proposed RTP design saves at least 12.13% of energy and 65.93% of flight time when the ground source and the ground destination are located 2000 m apart and need to transfer 5000 bit/Hz of data. Moreover, the DDPG-based design is more practical and more energy-efficient than the Deep Q Network based design.
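The geometry behind a running-track path can be illustrated with a minimal Python sketch. It is not the paper's energy or throughput model. It assumes a level coordinated turn at constant speed, for which the standard relation r = v^2 / (g tan(phi)) ties the roll-angle limit phi to the smallest feasible semicircle radius. The function names, the constant-speed assumption, and the example values are hypothetical and chosen only for illustration.

```python
import math

G = 9.81  # gravitational acceleration in m/s^2


def min_turn_radius(speed_mps: float, max_roll_deg: float) -> float:
    """Smallest radius of a level coordinated turn (assumed model, not from the paper)."""
    return speed_mps ** 2 / (G * math.tan(math.radians(max_roll_deg)))


def rtp_length(straight_m: float, radius_m: float) -> float:
    """Length of one lap: two straight segments plus two semicircles."""
    return 2.0 * straight_m + 2.0 * math.pi * radius_m


def rtp_lap_time(straight_m: float, radius_m: float, speed_mps: float) -> float:
    """Time for one lap, assuming the UAV flies at constant speed."""
    return rtp_length(straight_m, radius_m) / speed_mps


if __name__ == "__main__":
    # Hypothetical values: 30 m/s airspeed, 45 degree roll limit, 2000 m straights.
    r = min_turn_radius(30.0, 45.0)
    print(f"minimum turn radius: {r:.1f} m")
    print(f"lap length: {rtp_length(2000.0, r):.1f} m")
    print(f"lap time: {rtp_lap_time(2000.0, r, 30.0):.1f} s")
```

Under these assumptions, a tighter roll-angle limit enlarges the semicircles and lengthens each lap, which is one reason the roll-angle limit enters the path design as a constraint.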