Multi-hop graph transformer network for 3D human pose estimation

Zaedul Islam, A. Ben Hamza

doi:10.1016/j.jvcir.2024.104174

Abstract

Accurate 3D human pose estimation is a challenging task due to occlusion and depth ambiguity. In this paper, we introduce a multi-hop graph transformer network designed for 2D-to-3D human pose estimation in videos. The network leverages the strengths of multi-head self-attention and multi-hop graph convolutional networks with disentangled neighborhoods to capture spatio-temporal dependencies and handle long-range interactions. The proposed architecture consists of two blocks: a graph attention block composed of stacked layers of multi-head self-attention and graph convolution with a learnable adjacency matrix, and a multi-hop graph convolutional block composed of multi-hop convolutional and dilated convolutional layers. The combination of multi-head self-attention and multi-hop graph convolutional layers enables the model to capture both local and global dependencies, while the integration of dilated convolutional layers enhances the model’s ability to handle the spatial details required for accurate localization of the human body joints. Extensive experiments demonstrate the effectiveness and generalization ability of our model, which achieves competitive performance on benchmark datasets.
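The abstract does not spell out how the multi-hop neighborhoods are built, so the following is only a minimal sketch of one common reading of "disentangled": the k-hop adjacency matrix marks pairs of joints whose shortest-path distance on the skeleton graph is exactly k, rather than at most k. The function names `hop_distances` and `k_hop_adjacency`, the omission of self-loops, and the 5-joint chain are all assumptions made for illustration and are not taken from the paper.

```python
from collections import deque


def hop_distances(num_nodes, edges):
    """Shortest-path hop count between every pair of nodes (BFS from each node)."""
    neighbors = [[] for _ in range(num_nodes)]
    for u, v in edges:
        neighbors[u].append(v)
        neighbors[v].append(u)

    # None marks "not reached yet"; it stays None for disconnected pairs.
    dist = [[None] * num_nodes for _ in range(num_nodes)]
    for src in range(num_nodes):
        dist[src][src] = 0
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in neighbors[u]:
                if dist[src][v] is None:
                    dist[src][v] = dist[src][u] + 1
                    queue.append(v)
    return dist


def k_hop_adjacency(dist, k):
    """Binary matrix with 1 where two joints are exactly k hops apart (assumed definition)."""
    n = len(dist)
    return [[1 if dist[i][j] == k else 0 for j in range(n)] for i in range(n)]


# Hypothetical toy skeleton: a 5-joint chain 0-1-2-3-4 (not a real pose dataset layout).
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
dist = hop_distances(5, edges)
A1 = k_hop_adjacency(dist, 1)  # direct bone connections
A2 = k_hop_adjacency(dist, 2)  # joints exactly two bones apart
```

Under this reading, each hop level contributes its own adjacency matrix, so a graph convolution can weight near and distant joints separately instead of letting close neighbors dominate higher powers of the adjacency matrix.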

Journal: Journal of Visual Communication and Image Representation
Publication Date: May 1, 2024
Citations: 1

Similar Papers

Learning Dynamical Human-Joint Affinity for 3D Pose Estimation in Videos
Junhao Zhang ... Yali Wang
IEEE Transactions on Image Processing | VOL. 30
01 Jan 2020

Towards Accurate Human Pose Estimation in Videos of Crowded Scenes
Shuning Chang ... Yupeng Chen
12 Oct 2020

3D Human Pose Estimation with Spatial and Temporal Transformers
Ce Zheng ... Sijie Zhu
01 Oct 2021

Motion-Aware Heatmap Regression for Human Pose Estimation in Videos
Inpyo Song ... Jongmin Lee
01 Aug 2024
