Abstract

Deep learning on point clouds drives progress in 3D object detection. Despite rapid progress, point-based methods still suffer from problems such as incompleteness and occlusion, which are caused by the material properties of objects and by cluttered scenes. Such hard targets make identification more difficult or even lead to misidentification, severely weakening the performance of point-based methods on 3D object detection. To alleviate these problems, we propose Objformer, which boosts point-based 3D object detection via instance-wise interaction. We design an instance feature encoder to encode clean instance features, which contain key geometric priors and holistic semantic information. Further, we devise an instance interaction module that aggregates complementary features across instances through label-guided interaction, improving 3D object detection performance. Experiments show that Objformer outperforms previous state-of-the-art point-based methods on two popular benchmarks, ScanNet V2 and SUN RGB-D. Notably, our single-modal Objformer even outperforms a competing advanced multi-modal fusion method on both SUN RGB-D and ScanNet V2.
