Abstract

Advancing autonomous spacecraft proximity maneuvers and docking (PMD) is crucial for enhancing the efficiency and safety of inter-satellite services. One primary challenge in PMD is the accurate a priori definition of the system model, often complicated by inherent uncertainties in system modeling and observational data. To address this challenge, we propose a novel Lyapunov Bayesian actor-critic reinforcement learning algorithm that guarantees the stability of the control policy under uncertainty. The PMD task is formulated as a Markov decision process comprising the relative dynamic model, the docking cone, and the cost function. By applying Lyapunov theory, we reformulate temporal difference learning as a constrained Gaussian process regression, enabling the state-value function to act as a Lyapunov function. Additionally, the proposed Bayesian quadrature policy optimization method computes policy gradients analytically, satisfying the stability constraints while accommodating informational uncertainties in the PMD task. Experimental validation on a spacecraft air-bearing testbed demonstrates the strong performance of the proposed algorithm.
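The core numerical device mentioned in the abstract, Bayesian quadrature, can be illustrated with a minimal sketch. This is not the paper's implementation: it only shows how a Gaussian process fit to sampled function values yields a closed-form estimate of an expectation (here, under a standard normal measure with an RBF kernel), which is the mechanism that lets the method compute policy gradients analytically rather than by Monte Carlo. The function names, node placement, and kernel lengthscale are illustrative assumptions.

```python
import numpy as np

def bq_estimate(f, nodes, lengthscale=1.0, jitter=1e-6):
    """Bayesian-quadrature estimate of E[f(x)] under x ~ N(0, 1).

    Fits a zero-mean GP with an RBF kernel to f at the given nodes,
    then integrates the GP posterior mean in closed form.
    """
    X = np.asarray(nodes, dtype=float)
    y = f(X)
    # RBF Gram matrix: k(x, x') = exp(-(x - x')^2 / (2 l^2)),
    # with a small jitter on the diagonal for numerical stability.
    D = X[:, None] - X[None, :]
    K = np.exp(-D**2 / (2 * lengthscale**2)) + jitter * np.eye(len(X))
    # Kernel mean embedding z_i = ∫ k(x, x_i) N(x; 0, 1) dx,
    # available in closed form for the RBF kernel / Gaussian measure pair.
    s2 = lengthscale**2
    z = np.sqrt(s2 / (s2 + 1.0)) * np.exp(-X**2 / (2 * (s2 + 1.0)))
    # Integral of the posterior mean: z^T K^{-1} y.
    return z @ np.linalg.solve(K, y)

# Toy check: E[x^2] under N(0, 1) is exactly 1.
est = bq_estimate(lambda x: x**2, np.linspace(-5.0, 5.0, 25))
print(est)
```

In the paper's setting the integrand is the policy-gradient integrand rather than a toy function, and the value-function GP additionally carries the Lyapunov decrease constraint; the closed-form kernel integral above is what makes the gradient computation analytic.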
