Skeleton-Based Spatio-Temporal U-Network for 3D Human Pose Estimation in Video.

Weiwei Li, Rong Du, Shudong Chen

doi:10.3390/s22072573

Abstract

Despite great progress in 3D pose estimation from video, there is still a lack of effective means to extract spatio-temporal features of different granularity from complex dynamic skeleton sequences. To tackle this problem, we propose a novel skeleton-based spatio-temporal U-Net (STUNet) scheme that handles spatio-temporal features at multiple scales for 3D human pose estimation in video. The proposed STUNet architecture consists of a cascade of semantic graph convolution layers and structural temporal dilated convolution layers, which progressively extract and fuse spatio-temporal semantic features from fine-grained to coarse-grained. This U-shaped network achieves scale compression and feature squeezing by downscaling and upscaling, while abstracting multi-resolution spatio-temporal dependencies through skip connections. Experiments demonstrate that our model effectively captures comprehensive spatio-temporal features at multiple scales and achieves substantial improvements over mainstream methods on real-world datasets.
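The U-shaped idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: plain neighbor averaging stands in for semantic graph convolution, a fixed averaging kernel stands in for learned temporal weights, each joint carries a single scalar feature, and there is only one downscale/upscale level. All function names (`graph_mix`, `dilated_temporal_conv`, `toy_unet`, and so on) are hypothetical.

```python
from typing import List

Frame = List[float]   # one scalar feature per joint (toy simplification)
Clip = List[Frame]    # T frames of J joints


def graph_mix(frame: Frame, neighbors: List[List[int]]) -> Frame:
    """Spatial step: average each joint with its skeleton neighbors."""
    out = []
    for j, nbrs in enumerate(neighbors):
        group = [frame[j]] + [frame[k] for k in nbrs]
        out.append(sum(group) / len(group))
    return out


def dilated_temporal_conv(clip: Clip, weights: List[float], dilation: int) -> Clip:
    """Temporal step: per-joint 1D convolution whose taps are `dilation` frames apart."""
    num_frames = len(clip)
    half = len(weights) // 2
    out = []
    for t in range(num_frames):
        frame = []
        for j in range(len(clip[0])):
            acc = 0.0
            for i, w in enumerate(weights):
                # Clamp the tap index so the kernel reuses edge frames at the ends.
                src = min(max(t + (i - half) * dilation, 0), num_frames - 1)
                acc += w * clip[src][j]
            frame.append(acc)
        out.append(frame)
    return out


def downscale(clip: Clip) -> Clip:
    """Halve the temporal resolution by averaging adjacent frame pairs (assumes even T)."""
    return [[(a + b) / 2 for a, b in zip(clip[t], clip[t + 1])]
            for t in range(0, len(clip) - 1, 2)]


def upscale(clip: Clip) -> Clip:
    """Double the temporal resolution by repeating each frame."""
    return [list(frame) for frame in clip for _ in range(2)]


def st_block(clip: Clip, neighbors: List[List[int]], dilation: int) -> Clip:
    """Spatial mixing on every frame, followed by a dilated temporal convolution."""
    mixed = [graph_mix(frame, neighbors) for frame in clip]
    return dilated_temporal_conv(mixed, [1 / 3, 1 / 3, 1 / 3], dilation)


def toy_unet(clip: Clip, neighbors: List[List[int]]) -> Clip:
    """One-level U shape: fine features, coarse features, then a skip connection."""
    fine = st_block(clip, neighbors, dilation=1)
    coarse = st_block(downscale(fine), neighbors, dilation=2)
    restored = upscale(coarse)
    # Skip connection: add fine-grained features back to the upscaled coarse ones.
    return [[a + b for a, b in zip(f, r)] for f, r in zip(fine, restored)]


# Three joints in a chain (0-1-2), four frames of constant input.
chain = [[1], [0, 2], [1]]
features = toy_unet([[1.0, 1.0, 1.0] for _ in range(4)], chain)
```

The output keeps the input's shape (frames by joints), so in this sketch the skip connection is an element-wise sum; a real network would typically learn the fusion and stack several such levels.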

Journal: Sensors (Basel, Switzerland)
Publication Date: Mar 28, 2022
Citations: 2
License type: CC BY 4.0

Similar Papers

A self-supervised spatio-temporal attention network for video-based 3D infant pose estimation
Wang Yin ... Ming Yi
Medical Image Analysis | VOL. 96
18 May 2024

A Multi-Task Neural Network for Action Recognition with 3D Key-Points
Rongxiao Tang ... Zhenhua Guo
10 Jan 2021

Weakly-supervised pre-training for 3D human pose estimation via perspective knowledge
Zhongwei Qiu ... Dongmei Fu
Pattern Recognition | VOL. 139
05 Mar 2023

LCR-Net++: Multi-Person 2D and 3D Pose Detection in Natural Images.
Gregory Rogez ... Philippe Weinzaepfel
IEEE Transactions on Pattern Analysis and Machine Intelligence | VOL. 42
01 Jan 2019
