Abstract

We study 3D object detection from RGB-D images, which requires simultaneous localization (i.e., producing a bounding box around the object) and classification (i.e., determining the object category). The main challenges are high intra-class variability, illumination change, background clutter, and occlusion. To address them, we propose the Context-Assisted 3D (C3D) method, which integrates 2D information (RGB images), 3D information (RGB-D images), and object/scene context information. In the proposed C3D method, we first use a convolutional neural network (CNN) to jointly detect the 3D objects in a scene and predict its scene category. We then refine the detection result with a Conditional Random Field (CRF) model that incorporates the object potential, the scene potential, the scene/object context, the object/object context, and the room geometry. Extensive experiments demonstrate that the proposed C3D method achieves state-of-the-art performance for 3D object detection on the SUN RGB-D benchmark dataset.
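To make the refinement step concrete, the following is a minimal sketch of how a CRF of this kind could combine unary and context terms into a single energy and pick the lowest-energy joint labeling. It is an illustration under stated assumptions, not the C3D implementation: the function names (`crf_energy`, `map_assignment`), the cost tables, and the brute-force search are all hypothetical, the costs are made-up toy numbers rather than CNN outputs, and the room-geometry term is omitted because the abstract does not specify its form.

```python
from itertools import product


def crf_energy(obj_labels, scene, obj_unary, scene_unary, scene_obj, obj_obj):
    """Sum of costs for one joint labeling (lower is better).

    All tables are hypothetical stand-ins for the potentials named in the
    abstract; missing context entries default to a cost of 0.
    """
    energy = scene_unary[scene]                          # scene potential
    for i, label in enumerate(obj_labels):
        energy += obj_unary[i][label]                    # object potential
        energy += scene_obj.get((scene, label), 0.0)     # scene/object context
    for i in range(len(obj_labels)):
        for j in range(i + 1, len(obj_labels)):
            pair = tuple(sorted((obj_labels[i], obj_labels[j])))
            energy += obj_obj.get(pair, 0.0)             # object/object context
    return energy


def map_assignment(object_classes, scene_classes, obj_unary, scene_unary,
                   scene_obj, obj_obj):
    """Exhaustive search over all labelings; only feasible for toy problems."""
    best = None
    for scene in scene_classes:
        for labels in product(object_classes, repeat=len(obj_unary)):
            e = crf_energy(labels, scene, obj_unary, scene_unary,
                           scene_obj, obj_obj)
            if best is None or e < best[0]:
                best = (e, scene, labels)
    return best


# Toy example (assumed values): two detections, two scene categories.
object_classes = ["bed", "desk"]
scene_classes = ["bedroom", "office"]
obj_unary = [
    {"bed": 0.4, "desk": 0.6},   # detection 0: detector slightly prefers "bed"
    {"bed": 0.9, "desk": 0.2},   # detection 1: detector clearly prefers "desk"
]
scene_unary = {"bedroom": 0.5, "office": 0.5}
scene_obj = {
    ("bedroom", "desk"): 0.3,    # desks are somewhat unusual in bedrooms
    ("office", "bed"): 1.0,      # beds are very unusual in offices
}
obj_obj = {}                     # no pairwise object costs in this toy case

best = map_assignment(object_classes, scene_classes, obj_unary, scene_unary,
                      scene_obj, obj_obj)
print(best[1], best[2])
```

In this toy setup the context terms change the outcome: taken alone, detection 0 would be labeled "bed", but the joint minimum labels the scene "office" and both detections "desk". A practical system would replace the exhaustive search with an approximate inference method, since the number of labelings grows exponentially with the number of detections.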
