Human pose estimation in complex background videos via Transformer-based multi-scale feature integration

Chen Cheng, Huahu Xu

doi:10.1016/j.displa.2024.102805

Abstract

Human pose estimation remains an active research topic. Earlier algorithms based on traditional machine learning struggle with feature extraction and fuse features inefficiently. To address these problems, we propose a Transformer-based method that combines three components to capture the human pose: a Transformer-based feature extraction module, a multi-scale feature fusion module, and an occlusion processing mechanism. The feature extraction module uses self-attention to extract key features from the input sequence; the multi-scale feature fusion module fuses feature information at different scales to enhance the model's perception ability; and the occlusion processing mechanism handles occlusion in the data and removes background interference. Evaluated on the standard Human3.6M dataset and on an in-the-wild video dataset, our method performs strongly, achieving accurate pose prediction on both complex actions and challenging samples.
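As a rough illustration of two of the ideas named in the abstract, the NumPy sketch below applies single-head scaled dot-product self-attention to a sequence of per-frame features and then fuses the result at several temporal scales. It is not the authors' implementation: the function names, the pooling-and-concatenation fusion scheme, the scale set `(1, 2, 4)`, and the tensor sizes are all assumptions chosen for illustration. The occlusion processing mechanism is left out because the abstract gives no detail about it.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over a (T, D) sequence."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])   # (T, T) pairwise similarities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ v, weights

def fuse_multi_scale(x, scales=(1, 2, 4)):
    """Hypothetical fusion: average-pool over time at each scale,
    upsample back to T by repetition, and concatenate along features."""
    t, d = x.shape
    feats = []
    for s in scales:
        assert t % s == 0, "sequence length must be divisible by each scale"
        pooled = x.reshape(t // s, s, d).mean(axis=1)   # (T/s, D)
        feats.append(np.repeat(pooled, s, axis=0))      # back to (T, D)
    return np.concatenate(feats, axis=-1)               # (T, D * len(scales))

# Illustrative sizes only: 8 frames, 16-dimensional features per frame.
rng = np.random.default_rng(0)
T, D = 8, 16
frames = rng.normal(size=(T, D))
w_q, w_k, w_v = (rng.normal(size=(D, D)) / np.sqrt(D) for _ in range(3))

attended, weights = self_attention(frames, w_q, w_k, w_v)
fused = fuse_multi_scale(attended)
print(attended.shape, weights.shape, fused.shape)
```

In this sketch the scale-1 branch passes the attended features through unchanged, so the coarser branches only add context. A learned fusion (for example, a projection after concatenation) would be the natural next step, but that is likewise an assumption rather than something the abstract states.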

More From: Displays

Similar Papers

Intelligent Fault Diagnosis Method of Wind Turbines Planetary Gearboxes Based on a Multi-Scale Dense Fusion Network
Xinghua Huang ... Yuanyuan Li
Frontiers in Energy Research | VOL. 9
29 Nov 2021

A Deeply Supervised Convolutional Neural Network for Pavement Crack Detection With Multiscale Feature Fusion
Zhong Qu ... Dong-Yang Zhou
IEEE Transactions on Neural Networks and Learning Systems | VOL. 33
15 Mar 2021

Water Body Extraction in Remote Sensing Imagery Using Domain Adaptation-Based Network Embedding Selective Self-Attention and Multi-Scale Feature Fusion
Jiahang Liu ... Yue Wang
Remote Sensing | VOL. 14
23 Jul 2022

Multi-stream attentive generative adversarial network for dynamic scene deblurring
Jinkai Cui ... Weiguo Gong
Neurocomputing | VOL. 383
04 Dec 2019
