Abstract

Hough voting based on PointNet++ [1] has proven effective for 3D object detection, as verified by VoteNet [2], H3DNet [3], and others. However, we find there is still room for improvement in two aspects. First, most existing methods ignore the particular significance of different input formats and geometric primitives for predicting object proposals. Second, the features extracted by PointNet++ overlook contextual information about each object. In this paper, to tackle these issues, we introduce MCGNet, which learns multi-level geometric-aware and scale-aware contextual information for 3D object detection. Specifically, our network consists of a baseline module based on H3DNet, a geometric-aware module, and a context-aware module. The baseline module, fed with four types of inputs (Point, Edge, Surface, and Line), concentrates on extracting diversified geometric primitives, i.e., BB centers, BB face centers, and BB edge centers. The geometric-aware module is proposed to learn the different contributions among the four types of feature maps and the three geometric primitives. The context-aware module aims to establish long-range dependencies for either the four types of feature maps or the three geometric primitives. Extensive experiments on two large datasets with real 3D scans, SUN RGB-D and ScanNet, demonstrate that our method is effective for 3D object detection.
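The abstract does not specify how the context-aware module builds long-range dependencies, but a common way to do so over point features is a self-attention (non-local) block, in which every point's feature is updated by a weighted sum over all other points. The sketch below is a minimal, hypothetical NumPy illustration of that idea, not the paper's implementation; it uses identity query/key/value projections and a residual connection for simplicity, whereas a real module would use learned projections.

```python
import numpy as np

def self_attention(features: np.ndarray) -> np.ndarray:
    """Toy self-attention over an (N, d) array of point features.

    Each output row is the input row plus an attention-weighted sum of
    all rows, so every point can aggregate information from every other
    point -- a simple stand-in for "long-range dependencies".
    """
    # Hypothetical simplification: identity projections for Q, K, V.
    q, k, v = features, features, features
    # Scaled dot-product similarity between all pairs of points: (N, N).
    scores = q @ k.T / np.sqrt(features.shape[1])
    # Row-wise softmax (numerically stabilized) to get attention weights.
    w = np.exp(scores - scores.max(axis=1, keepdims=True))
    w /= w.sum(axis=1, keepdims=True)
    # Residual connection: original feature plus attended context.
    return features + w @ v
```

With learned projections and multiple heads, the same pattern could be applied either to the four types of feature maps or to the three geometric primitives, as the abstract describes.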
