Abstract

Many deep learning applications benefit from multi-task learning with several related objectives. In autonomous driving, accurately inferring motion and spatial information is essential for scene understanding. In this paper, we propose a unified framework for unsupervised joint learning of optical flow, depth, and camera pose. Specifically, we use a feature refinement module to adaptively discriminate and recalibrate features; the module integrates local features with their global dependencies to capture rich contextual relationships. Given a monocular video, our network first computes the rigid optical flow from the estimated depth and camera pose. We then design an auxiliary flow network to infer the non-rigid flow field. In addition, a forward–backward consistency check is adopted for occlusion reasoning. Extensive experiments on the KITTI dataset verify the effectiveness of the proposed approach. The results show that our network produces sharper, clearer, and more detailed depth and flow maps, and achieves performance competitive with recent state-of-the-art approaches.
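To make the two geometric steps in the abstract concrete, the following is a minimal sketch and not the paper's implementation. It assumes a pinhole camera model, computes the rigid flow of a single pixel from a given depth and relative camera pose, and applies a simple forward–backward test for occlusion. All function names, the parameter layout, and the thresholds `alpha` and `beta` are hypothetical choices made for illustration.

```python
from typing import List, Sequence, Tuple


def matvec3(m: Sequence[Sequence[float]], v: Sequence[float]) -> List[float]:
    """Multiply a 3x3 matrix by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def rigid_flow_at_pixel(
    u: float, v: float, depth: float,
    fx: float, fy: float, cx: float, cy: float,
    rotation: Sequence[Sequence[float]],
    translation: Sequence[float],
) -> Tuple[float, float]:
    """Flow induced at pixel (u, v) by camera motion alone (static scene).

    Assumes a pinhole camera with focal lengths (fx, fy) and principal
    point (cx, cy); rotation and translation map source-frame points
    into the target camera frame.
    """
    # Back-project the pixel to a 3D point using its depth.
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    z = depth
    # Apply the relative camera pose.
    xr, yr, zr = matvec3(rotation, [x, y, z])
    xt = xr + translation[0]
    yt = yr + translation[1]
    zt = zr + translation[2]
    if zt <= 0:
        raise ValueError("point falls behind the target camera")
    # Re-project into the target image and take the displacement.
    u2 = fx * xt / zt + cx
    v2 = fy * yt / zt + cy
    return (u2 - u, v2 - v)


def is_occluded(
    flow_fw: Tuple[float, float],
    flow_bw_at_target: Tuple[float, float],
    alpha: float = 0.01,  # illustrative threshold, not from the paper
    beta: float = 0.5,    # illustrative threshold, not from the paper
) -> bool:
    """Forward-backward check for one pixel.

    flow_bw_at_target is the backward flow sampled at the location the
    forward flow points to. For a visible pixel the two roughly cancel.
    """
    sx = flow_fw[0] + flow_bw_at_target[0]
    sy = flow_fw[1] + flow_bw_at_target[1]
    mismatch = sx * sx + sy * sy
    magnitude = (flow_fw[0] ** 2 + flow_fw[1] ** 2
                 + flow_bw_at_target[0] ** 2 + flow_bw_at_target[1] ** 2)
    return mismatch > alpha * magnitude + beta
```

In a full pipeline both steps would run densely over every pixel with the depth, pose, and flow predicted by the networks; the per-pixel form here only exposes the geometry.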
