Pose estimation of non-cooperative space objects is a key technique for on-orbit servicing, and algorithms based on low-quality, low-power monocular sensors offer a practical solution for spaceborne applications. Current monocular-vision methods for non-cooperative space objects generally consist of three stages: object detection, landmark regression, and a perspective-n-point (PnP) solver. These methods have drawbacks, however, such as low detection efficiency and the need for prior knowledge. To address these problems, an end-to-end pose estimation learning algorithm for non-cooperative space objects based on a dual-channel transformer is proposed. It comprises a feature extraction backbone network based on EfficientNet and two transformer-based pose estimation subnetworks. A quaternion SoftMax-like activation function is designed to improve the precision of orientation estimation. The method uses only RGB images, which eliminates the need for a CAD model of the satellite, and it simplifies the detection process by using an end-to-end network to predict the satellite pose directly. Experiments are carried out on the SPEED dataset provided by the European Space Agency (ESA). The results show that the proposed algorithm successfully predicts the satellite pose and effectively decouples the translation and orientation information, which significantly improves recognition efficiency compared with other methods.
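The abstract does not define the quaternion SoftMax-like activation, so the following is only a minimal sketch of the general idea that an orientation head must output a unit quaternion. It assumes plain L2 normalization as a stand-in (not the proposed activation) and adds the standard angular error between two unit quaternions. The function names `normalize_quaternion` and `orientation_error_deg` are hypothetical and chosen for illustration.

```python
import math


def normalize_quaternion(raw, eps=1e-12):
    """Map four raw network outputs to a unit quaternion.

    Assumption: plain L2 normalization, used here as a stand-in for the
    SoftMax-like activation, whose exact form the abstract does not give.
    """
    norm = math.sqrt(sum(x * x for x in raw))
    if norm < eps:
        # Degenerate input: fall back to the identity rotation (w, x, y, z).
        return (1.0, 0.0, 0.0, 0.0)
    return tuple(x / norm for x in raw)


def orientation_error_deg(q_pred, q_true):
    """Angle in degrees of the rotation separating two unit quaternions.

    The absolute value of the dot product makes q and -q, which encode
    the same rotation, give zero error.
    """
    dot = abs(sum(a * b for a, b in zip(q_pred, q_true)))
    dot = min(1.0, dot)  # guard against rounding slightly above 1
    return math.degrees(2.0 * math.acos(dot))


# Usage: normalize a raw 4-vector, then compare it with a reference pose.
q_pred = normalize_quaternion([0.9, 0.1, 0.0, 0.4])
q_true = (1.0, 0.0, 0.0, 0.0)
error = orientation_error_deg(q_pred, q_true)
```

Normalizing inside the network keeps the output on the unit sphere, so the angular error above is well defined without a separate projection step at evaluation time.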