MR-CapsNet: A Deep Learning Algorithm for Image-Based Head Pose Estimation on CapsNet

Hao Fang, Xin-Yu Zhang, Jun-Qing Liu, Kai Xie, Peng Wu, Jian-Biao He, Chang Wen

doi:10.1109/access.2021.3119615

Open Access

https://doi.org/10.1109/access.2021.3119615

Abstract

Head pose estimation from a single image is challenging because of complex background conditions and the characteristics of the human face. In this paper, we propose a Multi-stage Regression-Capsule Network (MR-CapsNet) to predict head pose from a single input image. We use residual attention blocks and squeeze-and-excitation blocks to extract features at three levels. CapsNet overcomes shortcomings of the traditional convolutional neural network: it aggregates the extracted features and describes the spatial relationships among them after aggregation. A multi-stage regression scheme then yields a compact and robust model. We tested our method on the AFLW2000 and BIWI datasets, obtaining mean absolute errors of 4.26% and 3.95%, respectively. In addition, we discuss the accuracy of our method when the eyes or mouth are occluded. The results of comprehensive experiments show that our method can accurately predict head pose.
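As a rough illustration of the reported metric, the sketch below computes a mean absolute error over yaw, pitch, and roll. Averaging the three per-angle errors into one number is a common convention in head pose benchmarks and is an assumption here, not something the abstract states. The function name `head_pose_mae` and all sample values are hypothetical, and angle wraparound is ignored.

```python
def head_pose_mae(predictions, ground_truth):
    """Return ([yaw_mae, pitch_mae, roll_mae], overall_mae).

    Both arguments are equal-length lists of (yaw, pitch, roll) tuples
    expressed in the same unit. Averaging the three per-angle errors
    into one figure is an assumed convention.
    """
    if not predictions or len(predictions) != len(ground_truth):
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(predictions)
    per_angle = [
        sum(abs(p[k] - g[k]) for p, g in zip(predictions, ground_truth)) / n
        for k in range(3)
    ]
    return per_angle, sum(per_angle) / 3


# Hypothetical predictions and labels for two images.
preds = [(10.0, -5.0, 2.0), (-20.0, 8.0, 0.0)]
truth = [(12.0, -4.0, 1.0), (-17.0, 6.0, -1.0)]
per_angle, overall = head_pose_mae(preds, truth)
print(per_angle, overall)
```

Reporting the per-angle values alongside the overall figure makes it easier to see which rotation axis dominates the error.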

Highlights

The development of a variety of perceptual devices has served as the basis for recent advancements in personalized entertainment.
We applied the capsule structure of the network during the feature aggregation stage of head pose estimation, constructed intermediate capsules using the "vertical and horizontal sliding windows" method to select feature information, and used linear combination between capsules to enhance their representational ability.
In this study, we developed a deep neural network model, MR-CapsNet, to predict head pose.

Summary

INTRODUCTION

The development of a variety of perceptual devices has served as the basis for recent advancements in personalized entertainment. Among model-based methods, Martins [23] proposed a framework to automatically estimate the pose of the human head in a single-view image. We applied the capsule structure of the network during the feature aggregation stage of head pose estimation, constructed intermediate capsules using the "vertical and horizontal sliding windows" method to select feature information, and used linear combination between capsules to enhance their representational ability. The capsule neural network linearly combines the feature maps and passes them through a dynamic routing algorithm to obtain richer feature information. This strengthens the network's understanding of the extracted facial features and reduces the impact of missing facial feature information on the prediction results. We combine the feature maps of the three stages and perform multi-stage regression to obtain the required probability vectors, which improves prediction accuracy.
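The dynamic routing step mentioned above can be sketched in plain Python. This follows the general routing-by-agreement scheme commonly used in capsule networks (squash nonlinearity, softmax coupling coefficients, agreement updates); it is not taken from the MR-CapsNet implementation, whose capsule sizes and iteration count are not given here. The names `squash`, `softmax`, `dynamic_routing`, and the toy input `u_hat` are all illustrative assumptions.

```python
import math


def squash(s):
    """Scale vector s to a length in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, iterations=3):
    """Route input capsules to output capsules by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j. Returns (v, c): the output capsule vectors and the
    coupling coefficients used in the last iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits
    v, c = [], []
    for _ in range(iterations):
        # Each input capsule distributes its output over all output capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted linear combination of the predictions for capsule j.
            s_j = [
                sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                for d in range(dim)
            ]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the current output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Toy input: three input capsules, two output capsules, 2-D vectors.
# All predictions for output 0 agree; those for output 1 conflict.
u_hat = [
    [[1.0, 0.0], [1.0, 0.0]],
    [[1.0, 0.0], [-1.0, 0.0]],
    [[1.0, 0.0], [0.0, 1.0]],
]
v, c = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in v]
print(lengths)
```

In this toy case the output capsule whose incoming predictions agree ends up with the longer vector, which is the sense in which routing emphasizes consistent feature evidence.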

FEATURE EXTRACTION

MULTISTAGE REGRESSION

EXPERIMENTAL RESULTS AND DISCUSSION

EXPERIMENTAL CRITERION

EXPERIMENTAL RESULT AND ANALYSIS

2) EVALUATION OF THE LABORATORY MODEL

3) EVALUATION IN THE PARTIALLY OCCLUDED CASE

CONCLUSION AND FUTURE WORK

AUTHOR CONTRIBUTIONS

Journal: IEEE Access: Practical Innovations, Open Solutions
Publication Date: Jan 1, 2021
Citations: 2
License type: CC BY 4.0
