Scene Graph Inference via Multi-Scale Context Modeling

Ning Xu, Mohan Kankanhalli, Yongkang Wong, Yuting Su, Weizhi Nie, An-An Liu

doi:10.1109/tcsvt.2020.2990989

Abstract

The scene graph generated for an image structurally represents its object interactions and substantially aids image scene understanding. To the best of our knowledge, most current works on scene graph generation focus chiefly on pairwise object regions for object and relation inference while ignoring the global visual context outside these regions. Guided by the intuition that object and relation inference can benefit from the visual context within an image, this paper proposes a multi-scale context modeling method that jointly discovers and integrates the complementary object-centric and region-centric contexts for scene graph inference. While the object-centric and region-centric contexts are modeled separately by their individual modules, a bi-directional message propagation strategy is designed to mutually reinforce the modeling of both contexts. A context-fused inference is then proposed to integrate the multi-scale context and guide scene graph inference. Extensive experiments on three benchmarks show that this method achieves competitive performance compared with state-of-the-art methods. Additional ablation studies further validate its effectiveness. Code is available at https://github.com/ningxu1990/MSCM.

Full Text