Abstract
As an important visual understanding task, scene graph generation has drawn widespread attention and could benefit a broad range of downstream vision applications. Traditional scene graph generation methods, which rely on various forms of context refinement, are trained with the probabilistic chain rule, which treats objects and relationships as independent entities. Although these methods have made considerable progress, this plain formulation overlooks the latent geometric structure of entities and relationships. To address this issue, we move beyond traditional real-valued representations and use Quaternion Relation Embedding (QuatRE) to generate scene graphs with more expressive hypercomplex representations. More specifically, we represent object entities as quaternions, hypercomplex values with one real and three imaginary components, and then formulate relation triplets with the Hamilton product. Because it explicitly models the latent inter-dependencies among all components and has strong expressive capacity, the proposed QuatRE method can better capture the interactions between entities. More importantly, QuatRE introduces no additional layers, so it can be used as a plug-in and generalized to other methods to improve their performance. Finally, extensive comparisons against state-of-the-art methods on two large-scale, widely used datasets, i.e., Visual Genome and Open Images, demonstrate the superiority and generalization capability of the proposed method on various metrics for both biased and unbiased inference.
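For readers unfamiliar with the Hamilton product mentioned above, the following is a minimal sketch of the standard quaternion multiplication rule. It is not the QuatRE implementation: the function name, the tuple layout `(a, b, c, d)` for `a + b*i + c*j + d*k`, and the constants `I`, `J`, `K` are assumptions made purely for illustration.

```python
# Illustrative sketch only: quaternions as plain (a, b, c, d) tuples meaning
# a + b*i + c*j + d*k. All names here are hypothetical, not from the paper.

def hamilton_product(q, p):
    """Return the Hamilton product q * p of two quaternions."""
    a1, b1, c1, d1 = q
    a2, b2, c2, d2 = p
    return (
        a1 * a2 - b1 * b2 - c1 * c2 - d1 * d2,  # real part
        a1 * b2 + b1 * a2 + c1 * d2 - d1 * c2,  # i component
        a1 * c2 - b1 * d2 + c1 * a2 + d1 * b2,  # j component
        a1 * d2 + b1 * c2 - c1 * b2 + d1 * a2,  # k component
    )

# The three imaginary units.
I = (0, 1, 0, 0)
J = (0, 0, 1, 0)
K = (0, 0, 0, 1)

print(hamilton_product(I, J))  # (0, 0, 0, 1), i.e. i * j = k
print(hamilton_product(J, I))  # (0, 0, 0, -1), i.e. j * i = -k
```

Each output component mixes all four components of both inputs, which is one way to read the inter-dependencies among components that the abstract refers to. The product is also non-commutative, so the order of the operands matters.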