Abstract

Spacecraft path-planning approaches that produce solutions that are not only autonomous but also generalizable promise to open new modalities for robotic exploration and servicing in orbit environments characterized by epistemic uncertainty. Our hypothesis is that, in the face of large systemic uncertainties, learning control policies from human demonstrations may offer a pathway to spacecraft guidance solutions that are more generalizable than, and complementary to, baseline-dependent methods. In this context, the performance discrepancy between spacecraft path-planning policies trained from human demonstrations and optimal baseline solutions has not been characterized. We define a low-thrust, minimum-time transfer problem with predefined boundary constraints across a wide range of binary asteroid systems, which serve as a “sandbox” for complex, highly variable, and poorly known orbit environments. We collect trajectories generated using optimal control theory (both indirect and direct methods) and trajectories from human demonstrations (recorded in a gamified version of the baseline environment). We test different implementations of behavioral cloning, analyzing both feedforward and stateless long short-term memory regressions. From these tests, we collect initial empirical evidence of performance degradation during cloning (compared to precisely optimal behavior) and of performance stability across variations of the environment parameters. Such evidence informs the selection and improvement of the behavioral cloning architecture in future studies.
