Segmenting Beyond the Bounding Box for Instance Segmentation

Xiaoliang Zhang, Linfeng Xu, Zichen Song, Hongliang Li, Fanman Meng

doi:10.1109/tcsvt.2021.3063377

https://doi.org/10.1109/tcsvt.2021.3063377

Abstract

Instance segmentation must correctly locate every instance in an image and segment each one precisely. The dominant methods currently take object detection as a pre-task and therefore depend heavily on its accuracy: if the detector fails to predict an accurate bounding box, segmentation performance degrades. In this paper, we present a novel instance segmentation method that addresses this problem, called Segmenting Beyond the Bounding Box (S3B-Net). S3B-Net adds a sub-network that helps detection-based instance segmentation methods segment the part of an instance that lies beyond the bounding box. Specifically, the sub-network first predicts a two-dimensional embedding for each pixel. A Gaussian function then computes, from this embedding, the probability that the pixel belongs to the corresponding instance. Finally, the output of the sub-network is combined with the output of the detection-based instance segmentation to generate a more precise instance mask. The sub-network can easily be added to existing detection-based instance segmentation methods. We conduct experiments on widely used instance segmentation datasets, such as COCO and Cityscapes. The results show a gain of 6.8 points over the baseline Mask R-CNN with ResNet-50-FPN on Cityscapes and a gain of 1.7 points with ResNet-101-FPN-DCN on COCO. S3B-Net outperforms the previous state-of-the-art instance segmentation method, which demonstrates that our method is competitive. The source code of our method will be made available.
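
The abstract does not give the exact formulas, so the following is only a minimal sketch of the idea it describes, not the authors' implementation. It assumes an isotropic Gaussian over the squared distance between a pixel's two-dimensional embedding and a per-instance center, and it assumes the two outputs are fused with an element-wise maximum. The names `gaussian_instance_prob`, `fuse_masks`, `center`, and `sigma`, as well as the toy values, are hypothetical.

```python
import numpy as np


def gaussian_instance_prob(embedding, center, sigma=1.0):
    """Probability that each pixel belongs to an instance.

    embedding: (H, W, 2) array, one 2-D embedding per pixel.
    center:    (2,) array, assumed embedding center of the instance.
    sigma:     assumed bandwidth of the isotropic Gaussian.
    Returns an (H, W) array with values in (0, 1].
    """
    sq_dist = np.sum((embedding - center) ** 2, axis=-1)
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


def fuse_masks(box_mask_prob, embed_prob):
    """Combine the detection-based mask with the embedding-based one.

    Element-wise maximum is an assumption made for illustration;
    the abstract only says that the two outputs are combined.
    """
    return np.maximum(box_mask_prob, embed_prob)


# Toy example: each pixel's embedding is simply its (x, y) coordinate.
H, W = 4, 6
ys, xs = np.mgrid[0:H, 0:W]
embedding = np.stack([xs, ys], axis=-1).astype(float)

center = np.array([2.0, 1.0])
embed_prob = gaussian_instance_prob(embedding, center, sigma=1.5)

# Detection-based mask pasted into the full image. The (too small)
# bounding box only covers columns 1-2, so everything else is zero.
box_mask_prob = np.zeros((H, W))
box_mask_prob[0:3, 1:3] = 0.9

fused = fuse_masks(box_mask_prob, embed_prob)
final_mask = fused > 0.5
```

In this toy case the pixel at row 1, column 3 lies outside the box, so the detection-based mask alone assigns it zero probability, while the fused mask still includes it because its embedding is close to the assumed instance center.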
