Abstract
Unsupervised learning methods have achieved remarkable performance in monocular depth and camera pose estimation, typically treating the two as a joint multi-task learning problem supervised by their inherent geometric consistency. However, most existing approaches rely on a generative model to produce the depth map, leaving room for improvement in depth-map resolution. To this end, we present an adversarial learning architecture for unsupervised estimation of high-resolution single-view depth and camera pose. Specifically, we present a multi-scale deep convolutional Generative Adversarial Network (GAN) based learning system consisting of three networks: a pose estimation network (PCNN) and a generator-discriminator pair (Generator-D and Discriminator-D) for depth map prediction. Furthermore, to generate high-resolution depth maps, we propose a multi-scale GAN model (MSGAN) that decomposes the difficult high-quality image generation problem into more manageable sub-problems through a coarse-to-fine process. We also modify the overall generation architecture of the GAN by replacing its down-sampling and up-sampling components, improving the quality and accuracy of the predicted depth. Finally, to improve the rate of convergence, we adopt a least-squares adversarial loss, which increases the penalty for outliers. Detailed quantitative and qualitative evaluations of the proposed framework on the KITTI dataset show that the proposed method provides better results for both pose estimation and depth recovery.
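For reference, the least-squares adversarial loss mentioned above follows the standard LSGAN formulation (Mao et al.); the sketch below is a minimal version assuming the usual real/fake target labels of 1 and 0, not necessarily the paper's exact weighting:

\[
\min_{D} \; \tfrac{1}{2}\,\mathbb{E}_{x \sim p_{\mathrm{data}}}\!\left[(D(x)-1)^{2}\right] + \tfrac{1}{2}\,\mathbb{E}_{z \sim p_{z}}\!\left[D(G(z))^{2}\right], \qquad \min_{G} \; \tfrac{1}{2}\,\mathbb{E}_{z \sim p_{z}}\!\left[(D(G(z))-1)^{2}\right]
\]

Because the quadratic penalty grows with a sample's distance from its target label, generated depth maps scored far from the decision boundary (outliers) receive larger gradients than under the sigmoid cross-entropy loss, which is the mechanism behind the claimed faster convergence.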