Abstract

When a wing-in-ground craft (WIG) adjusts its flying altitude, overshooting behavior may occur, which degrades both safety and stealth. In previous studies on path following, cross-track error was used together with other indicators to suppress overshoot only indirectly. This paper proposes a method for direct and gradual suppression of overshoot via deep reinforcement learning (DRL), which iterates the reward function by introducing a partial reward term based on the current overshoot magnitude. Each time an overshoot value is obtained from a DRL training run, a function of that overshoot is added to the reward function for retraining. The function is defined as a form of cross-track error evaluated within the band between the current overshoot magnitude and the target altitude, and it contributes the partial reward only while the WIG's overshoot remains below the current magnitude during training. The feasibility of the method is established by mathematical reasoning, and a simulated WIG performing an altitude change is used to validate it. Assuming the added partial function takes a basic first-order fractional form of the cross-track error scaled by a factor, the iterative reward shaping reduces overshoot to a minimal level, cutting it by more than 99.8% relative to the initial value. In addition, the influence of the scaling factor on overshoot is analyzed when the partial reward function is introduced in the first iteration. For a WIG's altitude adjustment, the method monotonically reduces overshoot to within tolerance.
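To make the iterative shaping scheme described above concrete, the following minimal Python sketch illustrates one possible reading of it. The names, the specific fractional form `factor / (1.0 + e)`, and the way the partial terms are accumulated are assumptions for illustration, not the paper's exact formulation.

```python
def make_partial_reward(overshoot_mag, factor):
    """Partial reward term built after one training run (illustrative names).

    It contributes reward only while the cross-track error past the target
    altitude stays below the overshoot magnitude measured in the previous
    run, so the retrained agent is pushed to keep the next overshoot smaller.
    """
    def partial_reward(cross_track_error):
        e = abs(cross_track_error)
        if e < overshoot_mag:          # counts only before overshoot worsens
            return factor / (1.0 + e)  # assumed first-order fractional form
        return 0.0
    return partial_reward


def shaped_reward(base_reward, partial_terms, cross_track_error):
    """Total reward = original task reward plus all partial terms added so far."""
    return base_reward + sum(p(cross_track_error) for p in partial_terms)
```

In this sketch, `partial_terms` would grow by one element per shaping iteration, each element constructed from the overshoot magnitude measured in the preceding DRL run and scaled by the analyzed factor.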
