Abstract

The goal of scene graph generation (SGG) is to classify the objects in a visual scene and their pairwise relationships. Object occlusion is a critical challenge when generating scene graphs for complex scenes, yet it has rarely been explored in recent work. Accordingly, in this paper, we propose a subset matching network (SM-Net) that addresses this problem. First, we decompose SGG into two types of subset matching problems: node subset matching and edge subset matching. Each node/edge subset handles the occlusion between one node/edge pair, thereby reducing the difficulty of SGG in a “divide and conquer” manner. Second, we introduce a node subset prediction module that uses subset-based message passing to refine the node subset representation and a matching loss to supervise node subset prediction. Third, we propose an edge subset prediction module that applies a feature selection-based fusion function to obtain edge subset features and a matching loss to supervise edge subset prediction. Experiments on three popular datasets show that our model achieves state-of-the-art performance. The code of SM-Net will be released.
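The abstract does not spell out how subsets are formed or how the matching loss is computed, so the following is only a minimal sketch under stated assumptions, not SM-Net's implementation. It assumes that a node subset is an unordered pair of nodes and that the matching loss is a permutation-invariant negative log-likelihood, i.e. the cheapest assignment of prediction slots to ground-truth labels within a subset. The names `node_subsets` and `matching_loss` are hypothetical.

```python
import math
from itertools import combinations, permutations

EPS = 1e-12  # guards against log(0)


def node_subsets(num_nodes):
    """Enumerate every unordered node pair.

    Assumption: each pair plays the role of one 'node subset'.
    """
    return list(combinations(range(num_nodes), 2))


def matching_loss(pred_probs, gt_labels):
    """Permutation-invariant loss for a single subset (illustrative only).

    pred_probs: one class-probability list per prediction slot.
    gt_labels:  ground-truth class indices, same length as pred_probs.
    Returns (lowest total negative log-likelihood, best assignment), where
    best assignment[slot] is the index of the ground truth given to that slot.
    """
    best_cost, best_perm = math.inf, None
    # Brute force over assignments; fine for the tiny subsets assumed here.
    for perm in permutations(range(len(gt_labels))):
        cost = sum(
            -math.log(max(pred_probs[slot][gt_labels[g]], EPS))
            for slot, g in enumerate(perm)
        )
        if cost < best_cost:
            best_cost, best_perm = cost, perm
    return best_cost, best_perm


if __name__ == "__main__":
    print(node_subsets(3))
    # Slot 0 is confident in class 1, slot 1 in class 0; the labels are [0, 1].
    print(matching_loss([[0.1, 0.9], [0.8, 0.2]], [0, 1]))
```

With two-element subsets only two assignments need to be compared, which is why brute force suffices in this sketch; for larger sets, a Hungarian-style solver such as SciPy's `scipy.optimize.linear_sum_assignment` would be the usual substitute.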
