Abstract
Point-based indoor 3D object detection has received increasing attention with the growing industrial demand for augmented reality, autonomous driving, and robotics. However, detection precision suffers on inputs with semantic ambiguity, i.e., shape symmetries, occlusion, and missing texture, which can make different objects appear similar from different viewpoints and thus confuse the detection model. Typical point-based detectors mitigate this problem by learning proposal representations that carry both geometric and semantic information, but such entangled representations may reduce both semantic and spatial discrimination. In this paper, we focus on alleviating the confusion caused by entanglement and on enhancing the proposal representation by considering each proposal's semantics and the context of the scene. We propose a semantic-context graph network (SCGNet) with two main modules: a category-aware proposal recoding module (CAPR) and a proposal context aggregation module (PCAg). To produce semantically clear features from the entangled representation, the CAPR module learns a high-level semantic embedding for each category to extract discriminative semantic clues. To further enhance the proposal representation and leverage these semantic clues, the PCAg module builds a graph to mine the most relevant context in the scene. With few bells and whistles, SCGNet achieves state-of-the-art performance and obtains consistent gains when applied to different backbones (0.9%–2.4% on ScanNet V2 and 1.6%–2.2% on SUN RGB-D for mAP@0.25). Code is available at https://github.com/dsw-jlu-rgzn/SCGNet.
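The two modules described above can be illustrated with a minimal NumPy sketch. This is an assumption-laden toy version, not the paper's implementation (for that, see the linked repository): `category_aware_recode` stands in for CAPR by soft-assigning each proposal feature to learned per-category embeddings to obtain a semantic clue, and `proposal_context_aggregate` stands in for PCAg by building a k-nearest-neighbor graph over proposals from semantic similarity and aggregating neighbor features. All function names, shapes, and the choice of softmax/k-NN weighting are illustrative assumptions.

```python
import numpy as np

def category_aware_recode(proposal_feats, category_embed):
    """Hypothetical CAPR sketch: soft-assign proposals to category embeddings.

    proposal_feats: (N, D) proposal features
    category_embed: (C, D) learned per-category semantic embeddings
    returns: (N, D) semantic clue per proposal
    """
    logits = proposal_feats @ category_embed.T              # (N, C) similarity to each category
    weights = np.exp(logits - logits.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)           # softmax over categories
    return weights @ category_embed                         # weighted mix of category embeddings

def proposal_context_aggregate(feats, semantic_clue, k=3):
    """Hypothetical PCAg sketch: k-NN graph over proposals via semantic similarity."""
    sim = semantic_clue @ semantic_clue.T                   # (N, N) pairwise semantic similarity
    np.fill_diagonal(sim, -np.inf)                          # exclude self-loops from the graph
    enhanced = feats.copy()
    for i in range(len(feats)):
        nbrs = np.argsort(sim[i])[-k:]                      # indices of the k most relevant proposals
        w = np.exp(sim[i, nbrs])
        w /= w.sum()                                        # normalized edge weights
        enhanced[i] = feats[i] + w @ feats[nbrs]            # residual context aggregation
    return enhanced

# Toy usage: 8 proposals, 16-dim features, 10 categories
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
cats = rng.normal(size=(10, 16))
clue = category_aware_recode(feats, cats)
out = proposal_context_aggregate(feats, clue, k=3)
```

In this sketch the semantic clue, not the raw (entangled) feature, decides which proposals are connected in the graph, while the aggregated message still carries the full feature, mirroring the abstract's idea of using semantic clues to select the most relevant scene context.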
More From: IEEE Transactions on Circuits and Systems for Video Technology