Abstract
A policy for six-degree-of-freedom docking maneuvers with rotating targets is developed through reinforcement learning and implemented as a feedback control law. Potential satellite-servicing clients and orbital debris objects often rotate about a constant axis within their Earth orbits. In the context of such missions, reinforcement learning provides an appealing framework for robust, autonomous maneuvers in uncertain environments with low on-board computational cost. This work uses proximal policy optimization to produce a docking policy, valid over a portion of the six-degree-of-freedom state space, for rotating or nonrotating targets while striving to minimize performance and control costs. Experiments using the simulated Apollo transposition and docking maneuver, with an induced spin in the lunar module, demonstrate the policy’s capabilities and provide a comparison with standard optimal control techniques. Specific challenges and workarounds, as well as the benefits and disadvantages of reinforcement learning for docking policies, are discussed to facilitate future research. As such, this work will serve as a foundation for further investigation of learning-based control laws for spacecraft proximity operations in uncertain environments.
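To make the training-and-deployment pipeline described above concrete, the following is a minimal sketch (not the authors' code) of how a docking policy could be trained with proximal policy optimization and then queried as a feedback control law. It assumes the stable-baselines3 PPO implementation and a Gymnasium environment; the state layout, placeholder dynamics, reward weights, and episode limits are all illustrative assumptions, standing in for the paper's actual relative-motion dynamics and cost formulation.

```python
# Hypothetical sketch: train a 6-DOF docking policy with PPO (stable-baselines3)
# and deploy the trained network as a low-cost state-feedback control law.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import PPO


class DockingEnv(gym.Env):
    """Toy proximity-operations environment with placeholder dynamics."""

    def __init__(self):
        # Assumed state layout: relative position (3), relative velocity (3),
        # attitude quaternion (4), angular rate (3) -- 13 components total.
        self.observation_space = spaces.Box(-np.inf, np.inf, shape=(13,), dtype=np.float32)
        # Assumed action: body-frame forces (3) and torques (3), normalized.
        self.action_space = spaces.Box(-1.0, 1.0, shape=(6,), dtype=np.float32)
        self.dt = 1.0  # integration step [s], illustrative

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.state = self.np_random.uniform(-1.0, 1.0, size=13).astype(np.float32)
        self.steps = 0
        return self.state, {}

    def step(self, action):
        # Placeholder double-integrator translation; a faithful model would use
        # relative orbital dynamics and quaternion attitude kinematics instead.
        self.state[:3] += self.dt * self.state[3:6]
        self.state[3:6] += self.dt * 0.01 * action[:3]
        self.steps += 1
        # Reward trades off docking error against control effort (weights assumed).
        reward = -float(np.linalg.norm(self.state[:6])) - 0.1 * float(np.sum(action**2))
        terminated = bool(np.linalg.norm(self.state[:3]) < 0.05)  # docked
        truncated = self.steps >= 500
        return self.state, reward, terminated, truncated, {}


env = DockingEnv()
model = PPO("MlpPolicy", env, verbose=0)
model.learn(total_timesteps=100_000)

# Deployment: the trained network maps the current state directly to an action,
# so each control step is a single forward pass with no onboard optimization.
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```

The deployment step illustrates the low on-board computational cost claimed in the abstract: once training is complete, the policy acts as an ordinary feedback law, requiring only one network evaluation per control cycle rather than solving an optimal control problem in flight.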