Abstract

Learning-based camera ego-motion estimation has attracted increasing attention and achieved impressive improvements. However, the accuracy of the unsupervised paradigm is still limited, especially in complex dynamic environments. In this paper, we propose a rigid-aware self-supervised generative adversarial network (GAN) for camera ego-motion estimation, which effectively learns the rigidity of the scene and improves the accuracy of ego-motion estimation by combining pixel- and structure-level perception. Specifically, a rigid-aware generator is first designed for joint unsupervised learning of optical flow, stereo depth, and camera pose from two consecutive frames. Then, an iterative pose refinement strategy with rigidity learning is presented to reduce the impact of moving objects in the scene. To overcome the limitations of purely pixel-wise photometric methods, a rigidity-mask-embedded discriminator is attached to perceive structural distortion artifacts in synthesized fake images, which encourages the generator to learn additional structure-level information and thus improves the accuracy of pose estimation. Experiments on benchmark datasets show that our model achieves state-of-the-art performance in terms of both relative pose error (RPE) and absolute trajectory error (ATE) compared to recent GAN-based methods.
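
The abstract does not include code or the exact loss formulation. As a rough illustration of the rigidity idea it describes, below is a minimal PyTorch sketch in which the rigid flow induced by the predicted depth and camera pose is compared against the predicted full optical flow, and pixels with a large residual (likely moving objects) are masked out of the photometric loss. All function names, the residual threshold, and the tensor conventions are illustrative assumptions, not the paper's actual implementation.

import torch

def rigid_flow_from_depth(depth, pose, K):
    # depth: (B, 1, H, W) predicted depth; pose: (B, 3, 4) relative [R|t];
    # K: (B, 3, 3) camera intrinsics. Returns the rigid flow (B, 2, H, W)
    # induced purely by camera motion over a static scene.
    B, _, H, W = depth.shape
    device = depth.device
    ys, xs = torch.meshgrid(torch.arange(H, device=device),
                            torch.arange(W, device=device), indexing="ij")
    grid = torch.stack([xs, ys, torch.ones_like(xs)], dim=0).float().view(3, -1)
    # Back-project pixels to 3-D camera coordinates: X = D * K^{-1} p.
    cam = torch.einsum("bij,jn->bin", torch.inverse(K), grid) * depth.view(B, 1, -1)
    # Apply the rigid transform X' = R X + t, then re-project: p' = K X'.
    R, t = pose[:, :, :3], pose[:, :, 3:]
    pix = K @ (R @ cam + t)
    pix = pix[:, :2] / pix[:, 2:].clamp(min=1e-6)
    return (pix - grid[:2].unsqueeze(0)).view(B, 2, H, W)

def rigidity_mask(rigid_flow, full_flow, thresh=3.0):
    # Pixels whose learned full optical flow agrees with the rigid flow are
    # treated as static; large residuals indicate independently moving
    # objects. The threshold (in pixels) is a hypothetical choice.
    residual = (full_flow - rigid_flow).norm(dim=1, keepdim=True)
    return (residual < thresh).float()

def masked_photometric_loss(target, warped, mask):
    # L1 photometric error evaluated only on pixels deemed rigid, so moving
    # objects do not corrupt the ego-motion supervision signal.
    return (mask * (target - warped).abs()).sum() / mask.sum().clamp(min=1.0)

Per the abstract, the same mask is also embedded in the discriminator so that the structure-level adversarial signal concentrates on rigid regions; that part is omitted from this sketch.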
