Abstract

When aggregating local information from neighbors, prevailing 3D instance segmentation backbones rely only on 3D coordinates to find neighboring points and do not check whether those points belong to the same object as the query point, so the model gathers many noisy features. In addition, traditional backbones fail to fully utilize multi-resolution information. As a result, previous methods have difficulty segmenting targets in cluttered scenes. To tackle these issues, we propose Instance-Augmented Net (IAN). The key components of our approach are the Instance-Augmented Block (IAB), the Instance-Augmented Upsampler (IAU), and Attentive Fusion (AF). In IAB, for each foreground point, we leverage its instance information to filter out noisy neighbors from other objects. IAU applies the same instance-augmented strategy to the upsampling process. Furthermore, to retain comprehensive information, we upsample multi-resolution feature maps and fuse them with attention generated by AF. Notably, by encoding neighborhood information, AF generates attention adaptively at the point level. Moreover, to further test the generality of models, we present Clutter and Occlusion (CAO), a new 3D instance segmentation dataset tailored for robotic grasping tasks. Extensive experiments on S3DIS, ScanNet, and CAO show the effectiveness of IAN.
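
To make the neighbor-filtering idea concrete, the following is a minimal NumPy sketch, not the IAB implementation itself. It assumes per-point instance labels are already available, uses a brute-force k-nearest-neighbor search and mean pooling, and treats a negative label as background; the function name `instance_filtered_mean`, the parameter `k`, and the background convention are all hypothetical choices made for illustration.

```python
import numpy as np

def instance_filtered_mean(coords, feats, instance_ids, k=3):
    """Average each point's k nearest neighbors, dropping neighbors
    that carry a different instance label than the query point.

    Assumptions (illustrative only): labels are given, a negative label
    means background, and background points keep all k neighbors.
    """
    coords = np.asarray(coords, dtype=float)
    feats = np.asarray(feats, dtype=float)
    instance_ids = np.asarray(instance_ids)

    # Brute-force pairwise squared distances, shape (n, n).
    diff = coords[:, None, :] - coords[None, :, :]
    dist2 = np.sum(diff * diff, axis=-1)

    # Indices of the k nearest points (normally including the point itself).
    knn = np.argsort(dist2, axis=1, kind="stable")[:, :k]

    out = np.empty_like(feats)
    for i in range(coords.shape[0]):
        nbrs = knn[i]
        if instance_ids[i] >= 0:
            # Keep only neighbors from the same object as the query point.
            nbrs = nbrs[instance_ids[nbrs] == instance_ids[i]]
        if nbrs.size == 0:
            # Fall back to the point's own feature if nothing survives.
            nbrs = np.array([i])
        out[i] = feats[nbrs].mean(axis=0)
    return out

# Two objects lying close together along the x axis.
coords = [[0.0, 0, 0], [0.1, 0, 0], [0.15, 0, 0], [1.0, 0, 0]]
feats = [[1.0], [1.0], [10.0], [10.0]]
labels = [0, 0, 1, 1]
filtered = instance_filtered_mean(coords, feats, labels, k=3)
```

In this toy scene the point at x = 0.15 lies spatially closer to the first object than to its own, so a purely coordinate-based neighborhood would mix features from both objects, whereas the filtered version keeps each point's aggregate within its own instance.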
