Abstract
We address click-based interactive segmentation with a novel transformer network. Transformer-based approaches show promising results in various computer vision tasks, yet all modern interactive segmentation methods are still based on convolutional networks. We propose a transformer network for interactive segmentation and explore three different ways to feed click information into the network. Through extensive evaluation, we show that our model, trained on a combination of COCO and LVIS, sets a new state of the art for click-based interactive segmentation on GrabCut, Berkeley, SBD, DAVIS, and Pascal VOC in terms of NoC (Number of Clicks) and mIoU. The source code is available at https://github.com/SamsungLabs/saic-is.
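For readers unfamiliar with the two metrics, the following is a minimal sketch of how they are commonly computed. It assumes the usual convention that NoC counts the clicks needed to reach a target IoU, capped at a fixed click budget. The threshold of 0.9, the budget of 20 clicks, and the function names `iou` and `noc` are illustrative assumptions and are not taken from the paper or its repository.

```python
import numpy as np


def iou(pred, gt):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    gt = np.asarray(gt, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        # Both masks are empty; treat them as a perfect match (assumption).
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


def noc(ious_per_click, threshold=0.9, max_clicks=20):
    """Clicks needed to reach `threshold` IoU, or `max_clicks` if never reached.

    `ious_per_click[k - 1]` is the IoU obtained after the k-th click.
    """
    for k, value in enumerate(ious_per_click[:max_clicks], start=1):
        if value >= threshold:
            return k
    return max_clicks


def mean_iou(ious):
    """mIoU: the mean IoU over a set of evaluated objects."""
    return float(np.mean(ious))
```

A dataset-level NoC is then the mean of `noc(...)` over all evaluated objects, and lower values indicate that fewer clicks were needed.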