Abstract

In this paper, we address click-based interactive segmentation with a novel transformer network. Transformer-based approaches have shown promising results in various computer vision tasks, yet all modern interactive segmentation methods are still based on convolutional networks. We propose a transformer network for interactive segmentation and explore three different ways of feeding click information into the network. Through extensive evaluation, we show that our model, trained on a combination of COCO and LVIS, sets a new state of the art for click-based interactive segmentation on GrabCut, Berkeley, SBD, DAVIS, and Pascal VOC in terms of NoC (Number of Clicks) and mIoU. The source code is available at https://github.com/SamsungLabs/saic-is.
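
For readers unfamiliar with the NoC metric, the following minimal sketch shows one common way it is computed: the number of clicks needed before the predicted mask first reaches a target IoU, averaged over the dataset. The IoU threshold and the 20-click cap are conventions assumed here from the interactive segmentation literature, not values stated in this abstract, and the function names `noc_at` and `mean_noc` are hypothetical rather than taken from the released code.

```python
from typing import Sequence


def noc_at(ious_per_click: Sequence[float], threshold: float,
           max_clicks: int = 20) -> int:
    """Number of clicks needed for one sample to reach `threshold` IoU.

    `ious_per_click[i]` is the IoU of the predicted mask after click i+1.
    If the threshold is never reached within `max_clicks` clicks, the
    sample is counted as `max_clicks` (assumed convention).
    """
    for click, iou in enumerate(ious_per_click[:max_clicks], start=1):
        if iou >= threshold:
            return click
    return max_clicks


def mean_noc(all_ious: Sequence[Sequence[float]], threshold: float,
             max_clicks: int = 20) -> float:
    """Average NoC over a dataset of per-sample IoU sequences."""
    counts = [noc_at(seq, threshold, max_clicks) for seq in all_ious]
    return sum(counts) / len(counts)


if __name__ == "__main__":
    # Two toy samples: the first reaches 0.90 IoU on click 1, the second on click 2.
    print(mean_noc([[0.95], [0.50, 0.92]], threshold=0.90))
```

Lower NoC is better: it means fewer user clicks were needed to obtain a mask of the required quality.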
