This paper presents a feedback guidance algorithm for proximity operations in the cislunar environment based on actor-critic reinforcement learning. The algorithm is lightweight, closed-loop, and capable of taking path constraints into account. The method uses reinforcement learning to make the well-known Zero-Effort-Miss/Zero-Effort-Velocity (ZEM/ZEV) guidance state-dependent, which allows path constraints to be embedded directly in the guidance law. The algorithm is tested in the circular restricted three-body problem (CRTBP) framework for Near Rectilinear Orbits (NRO) in the Earth-Moon system. It shows promising results in terminal guidance error and satisfies the path constraints in scenarios that include spherical constraints and keep-out spheres with approach corridors. More broadly, the results indicate that reinforcement learning can effectively solve constrained relative spacecraft guidance problems in complex environments, which makes it a promising tool for autonomous relative motion operations in the Earth-Moon dynamical environment.
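For readers unfamiliar with ZEM/ZEV guidance, the following is a minimal sketch of the classical feedback law that the method builds on. It is not the paper's implementation: it assumes gravity is neglected when predicting the zero-effort miss (in the CRTBP the prediction would instead require propagating the dynamics), and the function name `zem_zev_accel` and the gain parameters `k_r` and `k_v` are hypothetical names chosen for illustration. The defaults 6 and -2 are the classical energy-optimal gains; a learned policy would presumably replace these constants with state-dependent values.

```python
def zem_zev_accel(r, v, r_f, v_f, t_go, k_r=6.0, k_v=-2.0):
    """Classical ZEM/ZEV acceleration command (illustrative sketch).

    Assumption: gravity is neglected, so the zero-effort trajectory
    is a straight line. k_r and k_v are hypothetical parameter names;
    6 and -2 are the classical energy-optimal constant gains.
    """
    if t_go <= 0.0:
        raise ValueError("t_go must be positive")
    # Zero-effort miss: target position minus the position reached
    # at the final time if no further control is applied.
    zem = [rf_i - (r_i + v_i * t_go) for r_i, v_i, rf_i in zip(r, v, r_f)]
    # Zero-effort velocity error: target velocity minus the velocity
    # at the final time if no further control is applied.
    zev = [vf_i - v_i for v_i, vf_i in zip(v, v_f)]
    # Feedback law: a = (k_r / t_go^2) * ZEM + (k_v / t_go) * ZEV
    return [k_r / t_go**2 * m + k_v / t_go * e for m, e in zip(zem, zev)]


# Example: chaser at rest at the origin, target state 1 unit away
# along x, also at rest, with 1 time unit to go.
accel = zem_zev_accel([0.0, 0.0, 0.0], [0.0, 0.0, 0.0],
                      [1.0, 0.0, 0.0], [0.0, 0.0, 0.0], t_go=1.0)
```

Because the command is recomputed from the current state at every step, the law is closed-loop, and making `k_r` and `k_v` functions of the state is one way to bend the trajectory around constraints without changing the structure of the law.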