Segmenting Objects From Relational Visual Data.

Xiankai Lu,Luc Van Gool,David J Crandall,Jianbing Shen,Wenguan Wang

doi:10.1109/tpami.2021.3115815

Xiankai Lu, Luc Van Gool + Show 3 more

Open Access

PDF Available

https://doi.org/10.1109/tpami.2021.3115815

Copy DOI

Export

Save

Cite

Abstract
Full-Text PDF
Similar Papers

Abstract

Listen

In this article, we model a set of pixelwise object segmentation tasks - automatic video segmentation (AVS), image co-segmentation (ICS) and few-shot semantic segmentation (FSS) - in a unified view of segmenting objects from relational visual data. To this end, we propose an attentive graph neural network (AGNN) that addresses these tasks in a holistic fashion, by formulating them as a process of iterative information fusion over data graphs. It builds a fully-connected graph to efficiently represent visual data as nodes and relations between data instances as edges. The underlying relations are described by a differentiable attention mechanism, which thoroughly examines fine-grained semantic similarities between all the possible location pairs in two data instances. Through parametric message passing, AGNN is able to capture knowledge from the relational visual data, enabling more accurate object discovery and segmentation. Experiments show that AGNN can automatically highlight primary foreground objects from video sequences (i.e., automatic video segmentation), and extract common objects from noisy collections of semantically related images (i.e., image co-segmentation). AGNN can even generalize segment new categories with little annotated data (i.e., few-shot semantic segmentation). Taken together, our results demonstrate that AGNN provides a powerful tool that is applicable to a wide range of pixel-wise object pattern understanding tasks with relational visual data. Our algorithm implementations have been made publicly available at https://github.com/carrierlxk/AGNN.

Full Text