Abstract
3D object detection in RGB-D images is a fast-growing research area in computer vision. In this paper, we study the problem of amodal 3D object detection in RGB-D images and present an efficient 3D object detection system that predicts object location, size, and orientation. Unlike existing methods that use either multistage point cloud processing or pre-computed segmentation masks to generate 3D bounding boxes, we leverage only 2D region proposals for this task. Given a pair of color and depth images as input, we first predict 2D region proposals using the designed multimodal fusion region proposal networks, and then generate 3D bounding boxes from those proposals by scaling down the 2D bounding boxes by a scale factor and projecting them into 3D space. We evaluate our system on the challenging NYUv2 and SUN RGB-D datasets and compare it with state-of-the-art detection methods. The experimental results show that our method outperforms the state of the art by a remarkable margin with faster detection time. We achieve the best results on the NYUv2 dataset on a 19-class object detection task, and comparable detection performance with faster detection time on the SUN RGB-D dataset on a 10-class object detection task.
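The step of scaling down a 2D box and projecting it into 3D space can be illustrated with a minimal sketch. It is not the paper's implementation: it assumes a standard pinhole camera model, a box shrunk about its center, and the median of the valid depths inside the shrunken box as the object depth. The names `Intrinsics`, `shrink_box`, `backproject`, and `box_to_3d`, and the default scale of 0.7, are hypothetical. Orientation and extent along the viewing direction are not covered.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Intrinsics:
    """Pinhole camera parameters (assumed model, in pixels)."""
    fx: float
    fy: float
    cx: float
    cy: float


def shrink_box(box, scale):
    """Scale a 2D box (x1, y1, x2, y2) about its center by `scale`."""
    x1, y1, x2, y2 = box
    mid_x, mid_y = (x1 + x2) / 2.0, (y1 + y2) / 2.0
    half_w = (x2 - x1) * scale / 2.0
    half_h = (y2 - y1) * scale / 2.0
    return (mid_x - half_w, mid_y - half_h, mid_x + half_w, mid_y + half_h)


def backproject(u, v, z, k):
    """Map pixel (u, v) at depth z to camera-frame coordinates (X, Y, Z)."""
    return ((u - k.cx) * z / k.fx, (v - k.cy) * z / k.fy, z)


def box_to_3d(depth, box, k, scale=0.7):
    """Return two opposite 3D corners for a 2D proposal, or None.

    `depth` is a row-major list of rows; non-positive values are
    treated as missing measurements (an assumption of this sketch).
    """
    x1, y1, x2, y2 = shrink_box(box, scale)
    rows = range(max(0, int(y1)), min(len(depth), int(y2) + 1))
    cols = range(max(0, int(x1)), min(len(depth[0]), int(x2) + 1))
    samples = [depth[r][c] for r in rows for c in cols if depth[r][c] > 0]
    if not samples:
        return None  # no valid depth inside the shrunken box
    z = median(samples)  # assumed robust depth estimate for the object
    return backproject(x1, y1, z, k), backproject(x2, y2, z, k)
```

Shrinking the box before sampling depth is meant to reduce the share of background pixels near the proposal border; how the paper selects the scale factor is not stated in the abstract.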