Kinetic-impact defense against near-Earth asteroids (NEAs) faces several challenges, including the limited maneuverability of the impactor, inaccurate dynamic models, poor observability in relative navigation, and control execution errors. To address these challenges, this paper proposes an integrated robust navigation and guidance method for the kinetic impact of NEAs based on deep reinforcement learning (DRL), which directly maps angle measurements from a monocular camera to guidance maneuvers. First, the integrated navigation and guidance problem of NEA interception is modeled as a discrete partially observable Markov decision process (POMDP). Because impactors are usually equipped only with an aiming camera and measure only the line-of-sight (LOS) angle, past and current LOS measurements are concatenated into a one-dimensional observation vector, directly incorporating a memory of historical state information. A potential-based reward-shaping function is also designed to mitigate the sparsity of the main-goal reward. Proximal policy optimization (PPO) is then used to solve the established POMDP model, yielding an integrated navigation and guidance policy that maps the raw output of the navigation sensor directly to guidance commands. The potentially hazardous asteroid (PHA) Bennu is taken as the target, and a typical kinetic-impact defense scenario, accounting for the influence of multiple factors, is designed to simulate and verify the proposed model and method. Numerical simulation results show that the proposed method achieves a mean interception accuracy of 203.58 m. The proposed method abandons the traditional separate design of navigation and guidance algorithms, and its robustness is verified across a wide range of uncertain environments.
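To make the two key ideas in the abstract concrete, the following is a minimal sketch (not the paper's implementation) of (a) concatenating past and current LOS measurements into a one-dimensional observation vector and (b) a potential-based shaping reward added to a sparse terminal reward. The history window length, the zero-effort-miss form of the potential, and all gains are assumptions introduced here for illustration only.

```python
import numpy as np

HISTORY_LEN = 10  # number of past LOS measurements kept in the observation (assumed)


def build_observation(los_history):
    """Concatenate the most recent LOS angle pairs (azimuth, elevation)
    into a single 1-D observation vector, zero-padded when history is short."""
    padded = [(0.0, 0.0)] * max(0, HISTORY_LEN - len(los_history)) + \
             list(los_history[-HISTORY_LEN:])
    return np.asarray(padded, dtype=np.float32).ravel()  # shape (2 * HISTORY_LEN,)


def potential(rel_pos, rel_vel, scale=1e-3):
    """Assumed potential function: negative predicted miss distance (zero-effort miss),
    so reducing the predicted miss yields a positive shaping contribution."""
    t_go = max(1e-3, -np.dot(rel_pos, rel_vel) / (np.dot(rel_vel, rel_vel) + 1e-9))
    zem = rel_pos + rel_vel * t_go
    return -scale * np.linalg.norm(zem)


def shaped_reward(prev_state, state, hit, gamma=0.99, hit_bonus=10.0):
    """Sparse terminal reward plus potential-based shaping term
    F = gamma * phi(s') - phi(s), which does not alter the optimal policy."""
    r_sparse = hit_bonus if hit else 0.0
    return r_sparse + gamma * potential(*state) - potential(*prev_state)
```

In this sketch the observation vector would feed a standard PPO actor-critic policy; the specific network architecture, reward weights, and termination logic used in the paper are not reproduced here.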