Detecting Human-Object Interactions with Object-Guided Cross-Modal Calibrated Semantics

Hangjie Yuan, Mang Wang, Dong Ni, Liangpeng Xu

doi:10.1609/aaai.v36i3.20229

Abstract

Human-Object Interaction (HOI) detection is an essential task for understanding human-centric images from a fine-grained perspective. Although end-to-end HOI detection models thrive, their paradigm of parallel human/object detection and verb class prediction loses a merit of two-stage methods: the object-guided hierarchy. The object in an HOI triplet gives direct clues to the verb to be predicted. In this paper, we aim to boost end-to-end models with object-guided statistical priors. Specifically, we propose to utilize a Verb Semantic Model (VSM) and use semantic aggregation to profit from this object-guided hierarchy. A Similarity KL (SKL) loss is proposed to optimize the VSM to align with the HOI dataset's priors. To overcome the static semantic embedding problem, we propose to generate cross-modality-aware visual and semantic features by Cross-Modal Calibration (CMC). Combined, these modules compose the Object-guided Cross-modal Calibration Network (OCN). Experiments conducted on two popular HOI detection benchmarks demonstrate the significance of incorporating the statistical prior knowledge and produce state-of-the-art performance. More detailed analysis indicates that the proposed modules serve as a stronger verb predictor and a superior method of utilizing prior knowledge. The code is available at https://github.com/JacobYuan7/OCN-HOI-Benchmark.
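
As a rough illustration of what an object-guided statistical prior can look like, the sketch below estimates P(verb | object) from triplet annotations by simple co-occurrence counting. This is an assumption made for illustration only: the abstract does not specify how the priors are computed, and this is not the paper's VSM, SKL loss, or CMC module. The function name `verb_prior_given_object` and the toy annotations are hypothetical.

```python
from collections import Counter, defaultdict


def verb_prior_given_object(triplets):
    """Estimate P(verb | object) from (human, verb, object) annotations.

    Plain co-occurrence counting, used here only to illustrate the idea
    that the object in a triplet constrains the likely verbs. It is an
    assumption, not the formulation used in the paper.
    """
    counts = defaultdict(Counter)
    for _human, verb, obj in triplets:
        counts[obj][verb] += 1

    prior = {}
    for obj, verb_counts in counts.items():
        total = sum(verb_counts.values())
        # Normalize the counts for each object into a distribution over verbs.
        prior[obj] = {verb: n / total for verb, n in verb_counts.items()}
    return prior


# Toy annotations (hypothetical, not taken from any benchmark).
toy_triplets = [
    ("person", "ride", "bicycle"),
    ("person", "ride", "bicycle"),
    ("person", "hold", "bicycle"),
    ("person", "hold", "cup"),
]
prior = verb_prior_given_object(toy_triplets)
```

With these toy annotations, "ride" receives two thirds of the probability mass for "bicycle", while "cup" only admits "hold". A prior of this kind is the sort of dataset statistic that a learned verb predictor could be aligned with.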

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 28, 2022
Citations: 15

Similar Papers

Human object interaction detection in paintings using multi-task learning
Maya Antoun ... Daniel Asmar
Digital Applications in Archaeology and Cultural Heritage | VOL. 34
24 Jul 2024

An Optimization Model for Human-Object Interaction Detection Inspired by Multi-features
Hailan Kuang ... Jian Dong
01 Apr 2019

Detecting Human-Object Interaction via Fabricated Compositional Learning
Zhi Hou ... Baosheng Yu
01 Jun 2021

Exploring the synergy between textual identity and visual signals in human-object interaction
Pinzhu An ... Zhi Tan
Image and Vision Computing | VOL. 151
02 Sep 2024
