Attention-Based Pose Sequence Machine for 3D Hand Pose Estimation

Fangtai Guo, Xinyue Zhao, Jianrong Tan, Shuyou Zhang, Zaixing He

doi:10.1109/access.2020.2968361

Open Access

https://doi.org/10.1109/access.2020.2968361

Abstract

Most existing methods for 3D hand pose estimation operate on a single depth map. In that setting, missing depth in the input frames, caused by hand self-occlusion and limited imaging quality, leads to a multi-valued mapping and a sub-optimal model. In this paper, we propose a novel recurrent architecture named Attention-based Pose Sequence Machine (APSM), which alleviates these challenges by introducing temporal consistency. For the recurrent unit (RU), we extend the traditional Gated Recurrent Unit (GRU) with 3D convolutional neural networks (CNNs) to handle voxelized inputs and features, and we propose a novel RU named Deep Gated Recurrent Unit (DGRU) by rebuilding deeper gates on top of the GRU. To further improve model performance, we propose a novel spatial attention mechanism denoted Attention Model (AM). Ablation experiments validate each contribution of our work, and experiments on two publicly available datasets show that our method outperforms the state of the art in hand pose estimation.
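To make the recurrent idea concrete, the following is a minimal sketch of one standard GRU update applied to a voxel grid. It is not the authors' APSM or DGRU: the 3D convolutions are reduced to scalar weights applied at every voxel (equivalent to single-channel 1x1x1 kernels), and the names `gru_step`, `make_grid`, and the parameter keys are hypothetical, chosen only for illustration.

```python
import math
from itertools import product


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_grid(shape, value):
    """Build a depth x height x width nested list filled with `value`."""
    d, h, w = shape
    return [[[value for _ in range(w)] for _ in range(h)] for _ in range(d)]


def gru_step(x, h_prev, p):
    """One GRU update applied independently at every voxel.

    Assumption: each gate uses scalar weights (a stand-in for 3D
    convolutions with 1x1x1 kernels and a single channel).
    """
    d, hh, w = len(x), len(x[0]), len(x[0][0])
    h_new = make_grid((d, hh, w), 0.0)
    for i, j, k in product(range(d), range(hh), range(w)):
        xv, hv = x[i][j][k], h_prev[i][j][k]
        z = sigmoid(p["wz"] * xv + p["uz"] * hv + p["bz"])  # update gate
        r = sigmoid(p["wr"] * xv + p["ur"] * hv + p["br"])  # reset gate
        # Candidate state: the reset gate scales how much past state is used.
        cand = math.tanh(p["wh"] * xv + p["uh"] * (r * hv) + p["bh"])
        # Convention used here: z close to 1 favors the new candidate.
        h_new[i][j][k] = (1.0 - z) * hv + z * cand
    return h_new


# Hypothetical parameters and a short sequence of 2x2x2 voxel frames.
params = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
          "wr": 0.5, "ur": 0.5, "br": 0.0,
          "wh": 1.0, "uh": 1.0, "bh": 0.0}
frames = [make_grid((2, 2, 2), v) for v in (0.2, 0.4, 0.6)]

state = make_grid((2, 2, 2), 0.0)
for frame in frames:
    state = gru_step(frame, state, params)  # state carries temporal context
```

The hidden state is what carries information across frames, so a voxel whose depth is missing in one frame can still be informed by earlier frames. A faithful version would replace the scalar weights with learned 3D convolution kernels over multi-channel features.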

Highlights

Accurate 3D hand pose estimation is a critical technology for diverse human-computer interaction applications, such as virtual or augmented reality [1], driver interaction [2], and sign language recognition [3]–[5].
For the Recurrent Unit (RU), we propose a novel variant named Deep Gated Recurrent Unit (DGRU), which focuses on rebuilding deeper feature extraction gates.
We propose a novel recurrent architecture named Attention-based Pose Sequence Machine (APSM) for hand pose estimation, which introduces temporal consistency to alleviate the challenges caused by missing depth.

Summary

INTRODUCTION

Accurate 3D hand pose estimation is a critical technology for diverse human-computer interaction applications, such as virtual or augmented reality [1], driver interaction [2], and sign language recognition [3]–[5]. The image pairs illustrated in [14] require precise one-to-one correspondences, which a single frame can hardly guarantee when the real errors are large. Both challenges are attributed to missing depth of the 3D human hand, and most recent discriminative approaches [12]–[15] estimate hand pose from a single depth image, which usually leads to a sub-optimal model and a multi-valued mapping.
1. We propose a novel recurrent architecture named APSM for hand pose estimation, which introduces temporal consistency to alleviate the challenges caused by missing depth.
3. A novel spatial attention model, denoted AM, is introduced to weight the input features, and ablation experiments show that AM helps improve estimation accuracy.
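The feature weighting performed by a spatial attention model can be sketched as follows. This is an illustrative assumption, not the paper's AM: the attention score of each voxel is taken to be a sigmoid of a scalar affine function of that voxel's own feature, and the function name `spatial_attention` and the parameters `w` and `b` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(features, w=1.0, b=0.0):
    """Weight every voxel feature by an attention score in (0, 1).

    features: depth x height x width nested list of floats.
    w, b: hypothetical scalar scoring parameters (learned in practice).
    Returns the attention map and the re-weighted features.
    """
    attention = [[[sigmoid(w * v + b) for v in row] for row in plane]
                 for plane in features]
    weighted = [[[a * v for a, v in zip(a_row, f_row)]
                 for a_row, f_row in zip(a_plane, f_plane)]
                for a_plane, f_plane in zip(attention, features)]
    return attention, weighted


# A tiny 1 x 2 x 2 feature volume.
features = [[[0.0, 2.0], [-1.0, 4.0]]]
attention, weighted = spatial_attention(features)
```

Because the attention map has the same spatial shape as the features, it can suppress uninformative voxels while leaving the layout of the volume unchanged. A learned model would typically compute the scores with convolutions over neighboring voxels and channels rather than from each voxel alone.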

RELATED WORKS

GATED RECURRENT UNIT

DEEP GATED RECURRENT UNIT

ATTENTION MODEL

NETWORK TESTING

Findings

CONCLUSION

Journal: IEEE access : practical innovations, open solutions
Publication Date: Jan 1, 2020
Citations: 6
License type: CC BY 4.0

Similar Papers

V2V-PoseNet: Voxel-to-Voxel Prediction Network for Accurate 3D Hand and Human Pose Estimation from a Single Depth Map
Ju Yong Chang ... Kyoung Mu Lee
01 Jun 2018

MVPointNet: Multi-View Network for 3D Object Based on Point Cloud
Weiguo Zhou ... Yun-Hui Liu
IEEE Sensors Journal | VOL. 19
15 Dec 2019

A Deep Neural Network Model for Speaker Identification
Feng Ye ... Jun Yang
Applied sciences | VOL. 11
16 Apr 2021

Hand pose estimation based on deep learning depth map for hand gesture recognition
Naima Otberdout ... Lahoucine Ballihi
01 Apr 2017
