Abstract

Hough voting based on PointNet++ [1] has proven effective for 3D object detection, as verified by VoteNet [2], H3DNet [3], and others. However, we find there is still room for improvement in two aspects. First, most existing methods ignore the particular significance of different input formats and geometric primitives for predicting object proposals. Second, the features extracted by PointNet++ overlook contextual information about each object. In this paper, to tackle these issues, we introduce MCGNet, which learns multi-level geometric-aware and scale-aware contextual information for 3D object detection. Specifically, our network consists of a baseline module based on H3DNet, a geometric-aware module, and a context-aware module. The baseline module, fed with four types of inputs (Point, Edge, Surface, and Line), concentrates on extracting diversified geometric primitives, i.e., BB centers, BB face centers, and BB edge centers. The geometric-aware module is proposed to learn the different contributions among the four types of feature maps and the three geometric primitives. The context-aware module aims to establish long-range dependencies for either the four types of feature maps or the three geometric primitives. Extensive experiments on two large datasets with real 3D scans, SUN RGB-D and ScanNet, demonstrate that our method is effective for 3D object detection.
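The abstract does not specify how the context-aware module builds long-range dependencies, but a common way to do so over point features is a self-attention (non-local) block, in which every point's feature is updated by a weighted sum over all other points. The sketch below is a minimal, hypothetical NumPy illustration of that idea, not the paper's implementation; it uses identity query/key/value projections and a residual connection for simplicity, whereas a real module would use learned projections.

```python
import numpy as np

def self_attention(features: np.ndarray) -> np.ndarray:
    """Toy self-attention over an (N, d) array of point features.

    Each output row is the input row plus an attention-weighted sum of
    all rows, so every point can aggregate information from every other
    point -- a simple stand-in for "long-range dependencies".
    """
    # Hypothetical simplification: identity projections for Q, K, V.
    q, k, v = features, features, features
    # Scaled dot-product similarity between all pairs of points: (N, N).
    scores = q @ k.T / np.sqrt(features.shape[1])
    # Row-wise softmax (numerically stabilized) to get attention weights.
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # Residual connection: original feature plus attended context.
    return features + w @ v
```

With learned projections and multiple heads, the same pattern could be applied either to the four types of feature maps or to the three geometric primitives, as the abstract describes.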
