Exploring Spatio–Temporal Graph Convolution for Video-Based Human–Object Interaction Recognition

Ning Wang, Hongsheng Li, Liang Zhang, Xia Zhao, Mingtao Feng, Lan Ni, Peiyi Shen, Guangming Zhu, Lin Mei

doi:10.1109/tcsvt.2023.3259430

Abstract

Video-based human-object interaction recognition is challenging because the states of objects and the correlations among them change constantly throughout a video. Existing methods mainly rely on 3D CNNs or on separate components (e.g., GCN + RNN) that model spatial and temporal correlations independently; they neither model spatio-temporal correlations simultaneously nor capture the long-term temporal dynamics of objects. In this paper, we propose a novel model, named Spatio-Temporal Interaction Graph Parsing Networks (STIGPN), for human-object interaction recognition in videos. STIGPN captures spatial and temporal correlations simultaneously and can therefore model intra-frame and inter-frame dependencies efficiently and effectively. To model the long-term temporal dynamics of objects, we introduce spatio-temporal feature enhancement, which improves the detection of salient human-object interaction pairs. We explore three types of spatio-temporal graph convolution that capture spatio-temporal correlations simultaneously, and we assess the effectiveness of each as the basic building block of STIGPN. Extensive experiments on the CAD-120, Something-Else, and Charades datasets show that the proposed solution achieves results competitive with state-of-the-art methods. Code for STIGPN is available at: https://github.com/NingWang2049/STIGPN2.
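
To make the idea of capturing intra-frame and inter-frame dependencies in a single graph operation more concrete, the following is a minimal NumPy sketch of one generic spatio-temporal graph convolution. It is not the STIGPN implementation and does not correspond to any of the three variants explored in the paper; the edge layout (fully connected entities within a frame, each entity linked to itself in the next frame), the symmetric normalization, and all names (`build_st_adjacency`, `normalize`, `st_graph_conv`) are assumptions chosen for illustration.

```python
import numpy as np


def build_st_adjacency(num_frames, num_nodes):
    """Build one adjacency matrix over all (frame, entity) pairs.

    Node (t, i) is stored at index t * num_nodes + i. This layout is an
    illustrative assumption, not the graph construction used by STIGPN.
    """
    size = num_frames * num_nodes
    adj = np.zeros((size, size))
    for t in range(num_frames):
        start = t * num_nodes
        # Spatial edges: fully connect the entities of frame t
        # (the diagonal is set too, so self-loops are included).
        adj[start:start + num_nodes, start:start + num_nodes] = 1.0
        if t + 1 < num_frames:
            # Temporal edges: link each entity to itself in frame t + 1.
            for i in range(num_nodes):
                a, b = start + i, start + num_nodes + i
                adj[a, b] = adj[b, a] = 1.0
    return adj


def normalize(adj):
    """Symmetric normalization D^{-1/2} A D^{-1/2}."""
    d_inv_sqrt = 1.0 / np.sqrt(adj.sum(axis=1))
    return adj * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def st_graph_conv(features, adj_norm, weight):
    """One layer: aggregate over spatial and temporal neighbors, then ReLU."""
    return np.maximum(adj_norm @ features @ weight, 0.0)


# Toy example: 4 frames, 3 entities per frame (e.g. 1 human + 2 objects).
rng = np.random.default_rng(0)
T, N, C_in, C_out = 4, 3, 8, 16
x = rng.normal(size=(T * N, C_in))   # one feature vector per (frame, entity)
w = rng.normal(size=(C_in, C_out))   # random stand-in for a learned weight
adj = build_st_adjacency(T, N)
out = st_graph_conv(x, normalize(adj), w)
```

Because spatial and temporal edges live in the same adjacency matrix, a single matrix product mixes information within a frame and across adjacent frames, in contrast to a pipeline that applies a per-frame graph layer followed by a separate recurrent model.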


Journal: IEEE Transactions on Circuits and Systems for Video Technology
Publication Date: Oct 1, 2023
Citations: 3

