Abstract

This paper presents a comprehensive approach to enhancing autonomous docking maneuvers through machine visual perception and sim-to-real transfer learning. By leveraging relative vectoring techniques, we aim to replicate the human ability to execute precise docking operations. Our study focuses on autonomous aerial refueling as a use case, demonstrating significant advancements in relative navigation and object detection. We introduce a novel method for aligning digital twins using fiducial targets and motion capture data, which enables accurate pose estimation from real-world imagery. Additionally, we develop cost-efficient annotation automation techniques for generating high-quality You Only Look Once (YOLO) training data. Experimental results indicate that our transfer learning methodologies enable accurate and reliable relative vectoring in real-world conditions, achieving error margins of less than 3 cm at contact (when vehicles are approximately 4 m from the camera) while sustaining throughput above 56 frames per second. These findings underscore the potential of augmented reality and scene augmentation for improving model generalization and performance, bridging the gap between simulation and real-world applications. This work lays the groundwork for deploying autonomous docking systems in complex and dynamic environments, minimizing human intervention and enhancing operational efficiency.