Deep Full-Body HPE for Activity Recognition from RGB Frames Only

Sameh Neili Boualia, Najoua Essoukri Ben Amara

doi:10.3390/informatics8010002

Open Access

https://doi.org/10.3390/informatics8010002

Abstract

Human Pose Estimation (HPE) is defined as the problem of localizing human joints (also known as keypoints: elbows, wrists, etc.) in images or videos. It is also defined as the search for a specific pose in the space of all articulated joints. HPE has recently received significant attention from the scientific community, mainly because pose estimation is considered a key step for many computer vision tasks. Although many approaches have reported promising results, this domain remains largely unsolved due to several challenges such as occlusions, small and barely visible joints, and variations in clothing and lighting. In the last few years, the power of deep neural networks has been demonstrated in a wide variety of computer vision problems, especially the HPE task. In this context, we present in this paper a Deep Full-Body-HPE (DFB-HPE) approach from RGB images only. Based on ConvNets, fifteen human joint positions are predicted and can be further exploited for a large range of applications such as gesture recognition, sports performance analysis, or human-robot interaction. To evaluate the proposed deep pose estimation model, we apply it to recognize the daily activities of a person in an unconstrained environment. To that end, the extracted features, represented by deep estimated poses, are fed to an SVM classifier. To validate the proposed architecture, our approach is tested on two publicly available benchmarks for pose estimation and activity recognition, namely the J-HMDB and CAD-60 datasets. The obtained results demonstrate the efficiency of the proposed method based on ConvNets and SVM and show how deep pose estimation can improve the recognition accuracy. Compared with state-of-the-art methods, we achieve the best HPE performance, as well as the best activity recognition precision on the CAD-60 dataset.

Highlights

The amount of available video data is explosively expanding due to the pervasiveness of digital recording devices
Based on the work of Charles et al. [48], a joint is considered to be correctly located if it is within a set distance of d pixels from a marked joint center in the Ground Truth (GT)
For the CAD-60 dataset, different pose estimation results are presented in Figure 6 as accuracy graphs according to the allowed distance from the GT after applying the four-fold cross-validation process
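
The distance-based criterion in the highlights above can be illustrated with a minimal sketch. It is not the authors' evaluation code: the function names, the sample coordinates, and the choice of an inclusive threshold (distance <= d) are assumptions made for illustration. It counts a predicted joint as correct when its Euclidean distance to the GT joint center is at most d pixels, then sweeps d to obtain the points of an accuracy-versus-distance curve.

```python
import math


def joint_accuracy(pred, gt, d):
    """Fraction of joints whose prediction lies within d pixels of the GT.

    pred and gt are equal-length lists of (x, y) pixel coordinates.
    Assumption: the threshold is inclusive (distance <= d).
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must contain the same number of joints")
    if not gt:
        raise ValueError("at least one joint is required")
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(pred, gt)
        if math.hypot(px - gx, py - gy) <= d
    )
    return hits / len(gt)


def accuracy_curve(pred, gt, thresholds):
    """Accuracy for each allowed distance, as (d, accuracy) pairs."""
    return [(d, joint_accuracy(pred, gt, d)) for d in thresholds]


# Hypothetical coordinates for four joints (not taken from any dataset).
example_pred = [(10, 10), (20, 20), (30, 30), (40, 40)]
example_gt = [(10, 12), (25, 20), (30, 40), (40, 40)]
example_curve = accuracy_curve(example_pred, example_gt, [2, 5, 10])
```

Plotting the resulting pairs, with d on the horizontal axis, gives an accuracy graph of the kind described for Figure 6. In a cross-validation setting, one such curve would be computed per fold and then averaged.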

Summary

Introduction

The amount of available video data is explosively expanding due to the pervasiveness of digital recording devices. Previous works on HPE have commonly used graphical models for estimating human poses. Those models are composed of joints and rigid parts. In [7], the authors presented a graphical model for HPE with image-dependent pairwise relations. They used local image measurements both to detect joints and to predict the spatial relationships between them. The aim is to learn conditional probabilities for the presence of parts and their spatial relationships. After that, another approach was proposed using puppets [8]. It estimates the body pose in one frame and checks its performance in neighboring frames using the optical flow.

Methods

Results

Conclusion

Journal: Informatics
Publication Date: Jan 18, 2021
Citations: 6
License type: CC BY 4.0

Similar Papers

A Multi-Task Neural Network for Action Recognition with 3D Key-Points
Rongxiao Tang ... Luyang Wang
10 Jan 2021

3D hand pose and shape estimation from RGB images for keypoint-based hand gesture recognition
Danilo Avola ... Daniele Pannone
Pattern Recognition | VOL. 129
30 Apr 2022

3D Human pose estimation: A review of the literature and analysis of covariates
Nikolaos Sarafianos ... Ioannis A Kakadiaris
Computer Vision and Image Understanding | VOL. 152
08 Sep 2016

Localization of Human 3D Joints Based on Binocular Vision
Zheng Xu ... Yanchun Wu
01 Jan 2019
