Abstract

A scene graph is a structured representation of a scene in which objects are nodes and the relationships between pairs of objects are edges. Scene graphs are widely used in high-level vision-language and reasoning applications, so scene graph generation has become a popular topic in recent years. However, generation is limited by the bias that arises from the long-tailed distribution of relationships. Scene graph generators tend to predict head predicates, which are ambiguous and imprecise. As a result, the scene graph conveys less information and degenerates into a mere stack of objects, which prevents downstream applications from reasoning over the graph. To make the generator predict more diverse relationships and produce a precise scene graph, we propose an additional biased predictor (ABP)-assisted balanced learning method. The method introduces an extra relationship prediction branch that is especially affected by the bias, and uses it to make the generator pay more attention to tail predicates than to head ones. Whereas the scene graph generator predicts relationships between given object pairs, the biased branch predicts relationships without being assigned a particular object pair of interest, which makes it more concise. To train the biased branch, region-level relationship annotations are constructed automatically from the instance-level relationship annotations. Extensive experiments on popular datasets, i.e., Visual Genome, VRD, and OpenImages, show that ABP is effective across different scene graph generators. It also makes the generator predict more diverse and accurate relationships and yields a more balanced and practical scene graph.
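The abstract does not say how the region-level annotations are derived, so the following is only an illustrative sketch under stated assumptions, not the authors' procedure. It assumes that a region is the union box of an annotated subject-object pair, and that a region is labeled with every predicate whose subject and object boxes both fall inside it. The names `union_box`, `contains`, and `region_level_annotations` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Triple = Tuple[Box, str, Box]            # (subject box, predicate, object box)


def union_box(a: Box, b: Box) -> Box:
    """Smallest box that encloses both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def contains(outer: Box, inner: Box) -> bool:
    """True if inner lies entirely within outer."""
    return (outer[0] <= inner[0] and outer[1] <= inner[1]
            and outer[2] >= inner[2] and outer[3] >= inner[3])


def region_level_annotations(triples: List[Triple]) -> Dict[Box, Set[str]]:
    """Assumed construction: one region per annotated pair (its union box),
    labeled with all predicates whose subject and object are inside it.
    No object pair is attached to the resulting label set."""
    regions: Dict[Box, Set[str]] = {}
    for subj, _, obj in triples:
        region = union_box(subj, obj)
        labels = regions.setdefault(region, set())
        for s, pred, o in triples:
            if contains(region, s) and contains(region, o):
                labels.add(pred)
    return regions


# Toy instance-level annotations (boxes are made up for illustration).
man = (10, 10, 50, 100)
horse = (20, 40, 120, 110)
hat = (15, 0, 45, 15)
triples = [(man, "riding", horse), (man, "on", horse), (man, "wearing", hat)]
regions = region_level_annotations(triples)
```

In this toy case the man-horse region collects both "riding" and "on", while the man-hat region collects only "wearing". A predictor trained on such labels sees a region and a set of predicates rather than a specific object pair, which matches the abstract's description of the biased branch only at this coarse level.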
