Abstract
The task of 3D layout estimation in an indoor scene is to predict the holistic 3D structure of the scene from a single RGB image. Ground truth 3D layouts are costly to obtain, which severely restricts learning-based 3D layout estimation approaches. In this paper, we present a novel weakly supervised learning framework that learns the 3D layout effectively using only 2D layout segmentation masks as supervision. We employ a deep neural network to predict the plane parameters of the scene and the camera intrinsic parameters from the image. From the predicted plane instances, the 3D layout and the corresponding depth map and 2D segmentation can be generated. The key objectives for learning meaningful plane parameters are the label consistency of the layout segmentation and the depth consistency of border pixels shared by adjacent planes; through these objectives, the ground truth 2D layout segmentation is able to supervise the learning of the 3D layout. We further incorporate 3D geometric reasoning and prior knowledge into the learning process to ensure that the learned 3D layout is realistic and reasonable. Experimental results show that our method can produce accurate 3D layout estimates under weak supervision.
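To make the step from plane parameters to a depth map and a 2D segmentation more concrete, the following is a minimal sketch and not the authors' implementation. It assumes each plane is given as (n, d) with n·X = d in camera coordinates, a pinhole intrinsic matrix K, and a convex room seen from inside, so that the visible layout plane at a pixel is the one with the smallest positive depth. The function names `plane_depth_maps` and `render_layout` are hypothetical.

```python
import numpy as np

def plane_depth_maps(planes, K, height, width):
    """Per-plane depth at every pixel (assumed parameterization: n . X = d).

    planes: (P, 4) rows of (nx, ny, nz, d) in camera coordinates.
    K:      (3, 3) pinhole intrinsics with last row (0, 0, 1).
    Returns an (H, W, P) array; invalid intersections are set to +inf.
    """
    planes = np.asarray(planes, dtype=float)
    K = np.asarray(K, dtype=float)
    # Pixel centers in homogeneous image coordinates.
    u, v = np.meshgrid(np.arange(width) + 0.5, np.arange(height) + 0.5)
    pixels = np.stack([u, v, np.ones_like(u)], axis=-1)   # (H, W, 3)
    # Back-projected rays K^-1 [u, v, 1]^T; their z component is 1,
    # so the scale along a ray equals depth along the optical axis.
    rays = pixels @ np.linalg.inv(K).T                    # (H, W, 3)
    normals, offsets = planes[:, :3], planes[:, 3]
    denom = rays @ normals.T                              # (H, W, P)
    with np.errstate(divide="ignore", invalid="ignore"):
        depth = offsets / denom
        # Planes parallel to the ray or behind the camera are not visible.
        depth[~np.isfinite(depth) | (depth <= 0)] = np.inf
    return depth

def render_layout(planes, K, height, width):
    """Layout depth map and plane-index segmentation.

    Assumption (not from the source): the room is convex, so the visible
    plane at each pixel is the one with minimum positive depth.
    """
    depth = plane_depth_maps(planes, K, height, width)
    segmentation = np.argmin(depth, axis=-1)
    layout_depth = np.min(depth, axis=-1)
    return layout_depth, segmentation
```

In a sketch like this, a predicted segmentation could be compared against the ground truth 2D mask, and the depths of two adjacent planes could be compared at pixels on their shared border, which is the kind of consistency signal the abstract describes. A trainable version would need a differentiable replacement for the hard `argmin`.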
More From: IEEE transactions on image processing : a publication of the IEEE Signal Processing Society