Abstract

Owing to their flexibility and low operational cost, unmanned aerial vehicles (UAVs) are a promising means of collecting measurements for spectrum cartography (SC). The main goal is to optimize the trajectory of a UAV so that it seeks the most informative measurements in an environment with dynamic emitters. In this letter, we formulate a Markov Decision Process to find the optimal flight trajectory of a UAV that maximizes the accuracy of SC while minimizing energy consumption. However, because instantaneous feedback on the accuracy of SC is unavailable, the feedback is sparse, and existing methods do not work efficiently under this condition. To tackle this issue, a Proximal Policy Optimization (PPO)-based algorithm is proposed that approaches the optimal navigation policy for the UAV by training on delayed interaction information at the base station. Moreover, a backtracking advantage function is constructed to cope with sparse feedback in real-world scenarios, which helps avoid convergence to locally optimal solutions. Extensive simulation results demonstrate that the proposed algorithm can significantly increase the accuracy of SC while reducing energy consumption.
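
To make the two ingredients mentioned above more concrete, the following is a minimal sketch, under stated assumptions, of (i) how a reward that arrives only at the end of a flight can be propagated backward to earlier steps and (ii) the standard PPO clipped surrogate term. It is not the letter's backtracking advantage function, whose definition is not given in the abstract; the backward sweep here is the ordinary discounted return-to-go, used only to illustrate credit assignment under sparse feedback. The function names, the discount factor `gamma`, and the clipping range `eps` are hypothetical choices for illustration.

```python
from typing import List


def backward_returns(rewards: List[float], gamma: float = 0.99) -> List[float]:
    """Discounted return-to-go, computed by sweeping the episode backward.

    With sparse feedback most entries of `rewards` are zero, so the sweep is
    what carries a late reward back to the earlier decisions.
    (Illustrative stand-in, not the paper's backtracking advantage function.)
    """
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """Standard PPO clipped objective for a single sample.

    ratio     : new-policy probability divided by old-policy probability
    advantage : advantage estimate for the sampled action
    eps       : clipping range (assumed value, not taken from the paper)
    """
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)


if __name__ == "__main__":
    # Hypothetical 3-step episode: feedback arrives only at the last step.
    print(backward_returns([0.0, 0.0, 1.0], gamma=0.5))
    # A large policy ratio with a positive advantage is clipped at 1 + eps.
    print(clipped_surrogate(1.5, 1.0))
```

In this sketch the only nonzero reward sits at the final step, yet every earlier step receives a nonzero return after the backward sweep, which is the basic effect any scheme for sparse feedback has to achieve.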
