The capabilities of AUV mutual perception and localization are crucial for the development of AUV swarm systems. We propose the AUV6D model, a synthetic image-based approach to enhance inter-AUV perception through 6D pose estimation. Due to the challenge of acquiring accurate 6D pose data, a dataset of simulated underwater images with precise pose labels was generated using Unity3D. Mask-CycleGAN technology was introduced to transform these simulated images into realistic synthetic images, addressing the scarcity of available underwater data. Furthermore, the Color Intermediate Domain Mapping strategy is proposed to ensure alignment across different image styles at pixel and feature levels, enhancing the adaptability of the pose estimation model. Additionally, the Salient Keypoint Vector Voting Mechanism was developed to improve the accuracy and robustness of underwater pose estimation, enabling precise localization even in the presence of occlusions. The experimental results demonstrated that our AUV6D model achieved millimeter-level localization precision and pose estimation errors within five degrees, showing exceptional performance in complex underwater environments. Navigation experiments with two AUVs further verified the model’s reliability for mutual 6D pose estimation. This research provides substantial technical support for more complex and precise collaborative operations for AUV swarms in the future.