Multi-Person Pose Estimation Using an Orientation and Occlusion Aware Deep Learning Network.

Yanlei Gu, Shunsuke Kamijo, Huiyang Zhang

doi:10.3390/s20061593


Open Access

https://doi.org/10.3390/s20061593


Journal: Sensors	Publication Date: Mar 12, 2020
Citations: 9	License type: CC BY 4.0

Affiliation: Ritsumeikan University, The University of Tokyo

Abstract

Image-based understanding of human behavior and activity has been an active topic in computer vision and multimedia. Skeleton estimation, also called pose estimation, is an important part of this problem and has attracted considerable interest. Most deep learning approaches to pose estimation focus mainly on the joint feature. However, the joint feature alone is not sufficient, especially when the image contains multiple people and a pose is occluded or not fully visible. This paper proposes a novel multi-task framework for multi-person pose estimation. The framework is built on Mask Region-based Convolutional Neural Networks (Mask R-CNN) and extended to integrate the joint feature, body boundary, body orientation and occlusion condition. To further improve multi-person pose estimation, the paper organizes the different kinds of information in serial multi-task models instead of the widely used parallel multi-task network. The proposed models are trained on the public Common Objects in Context (COCO) dataset, which is further augmented with ground truths for body orientation and the mutual-occlusion mask. Experiments evaluate the proposed method on multi-person pose estimation and body orientation estimation. The proposed method achieves a Percentage of Correct Keypoints (PCK) of 84.6% and a Correct Detection Rate (CDR) of 83.7%. Comparisons further show that the proposed model reduces over-detection relative to other methods.

Highlights

Human pose estimation is defined as the problem of localizing human joints in images or videos.
The COCO (Common Objects in Context) dataset is an open dataset built by Microsoft, Facebook and others, which has a large volume of images for general object detection and segmentation tasks.
The proposed network model is an extended multi-task network based on the layer heads of Mask Region-based Convolutional Neural Networks (Mask R-CNN). It consists of five tasks: (1) joint position estimation, (2) body segmentation, (3) joint visibility mask, (4) body orientation recognition and (5) mutual-occlusion mask. The five tasks are separated into three branches: the body segmentation branch, the joint position estimation branch and the orientation-occlusion branch.
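
The grouping of five tasks into three branches can be written down as a small lookup table. The sketch below is only illustrative: the identifier names are made up for this example, and placing the joint visibility mask in the joint position estimation branch is an assumption, since the highlights do not say which branch it belongs to.

```python
# Illustrative mapping of the five tasks onto the three branches named above.
# ASSUMPTION: the joint visibility mask is grouped with joint position
# estimation; the text does not state which branch it belongs to.
BRANCHES = {
    "body_segmentation": ["body_segmentation"],
    "joint_position_estimation": [
        "joint_position_estimation",
        "joint_visibility_mask",
    ],
    "orientation_occlusion": [
        "body_orientation_recognition",
        "mutual_occlusion_mask",
    ],
}


def all_tasks(branches):
    """Flatten the branch table into a single list of task names."""
    return [task for tasks in branches.values() for task in tasks]


def branch_of(task, branches):
    """Return the name of the branch that contains the given task."""
    for branch, tasks in branches.items():
        if task in tasks:
            return branch
    raise KeyError(task)
```

For instance, `branch_of("mutual_occlusion_mask", BRANCHES)` returns `"orientation_occlusion"`.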

Summary

Introduction

Human pose estimation is defined as the problem of localizing human joints (also known as keypoints, such as elbows and wrists) in images or videos. Pose estimation has recently received significant attention from other research fields because of the valuable information contained in human pose data.

Non-Deep Neural Network Approach

Deep Neural Network for Single Person Pose Estimation

Deep Neural Network for Multi-Person Pose Estimation

Information Used for Human Pose Estimation

Limitation

Parallel Multi‐Task Network for Pose Estimation

Body Segmentation Branch

The body segmentation branch predicts for each RoI a convolutional

Joint Position Estimation Branch

This branch consists

Architecture

Joint position combined with body segmentation
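
The abstract contrasts a parallel multi-task network with serial multi-task models. A toy sketch of the difference in data flow follows. It assumes, purely for illustration, that in the serial case the segmentation output is appended to the input of the joint head; the two head functions are hypothetical stand-ins for sub-networks and are not the paper's layers.

```python
def seg_head(features):
    """Hypothetical stand-in for a segmentation head: one summary value."""
    return [sum(features)]


def joint_head(inputs):
    """Hypothetical stand-in for a joint head: largest and smallest input."""
    return [max(inputs), min(inputs)]


def parallel_forward(roi_features):
    """Parallel layout: every head reads the same shared RoI features."""
    return {
        "segmentation": seg_head(roi_features),
        "joints": joint_head(roi_features),
    }


def serial_forward(roi_features):
    """Serial layout (assumed ordering): the joint head also sees the
    segmentation output, concatenated onto the shared RoI features."""
    seg = seg_head(roi_features)
    joints = joint_head(roi_features + seg)
    return {"segmentation": seg, "joints": joints}


roi_features = [1.0, 2.0, 3.0]
parallel_out = parallel_forward(roi_features)
serial_out = serial_forward(roi_features)
```

With these toy heads the segmentation output is the same in both layouts, while the joint output changes in the serial layout because it depends on the segmentation result.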

Results

COCO Keypoint Dataset

Extended Sub Dataset with Mutual-Occlusion and Body Orientation

Dataset for Training and Evaluation

Evaluation for Joint Position Estimation
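
The abstract reports joint accuracy as the Percentage of Correct Keypoints (PCK). A minimal sketch of one common PCK definition is given below: a predicted joint counts as correct when it lies within `alpha` times a reference size of the ground-truth joint. The choice of reference size (here a person bounding-box size) and the default `alpha = 0.2` are assumptions for illustration; this excerpt does not state the normalization the paper uses.

```python
import math


def pck(pred, gt, visible, ref_size, alpha=0.2):
    """Fraction of visible ground-truth joints whose prediction lies within
    alpha * ref_size pixels. ref_size and alpha are illustrative choices."""
    threshold = alpha * ref_size
    correct = 0
    total = 0
    for (px, py), (gx, gy), is_visible in zip(pred, gt, visible):
        if not is_visible:
            continue  # joints without a visible ground truth are not scored
        total += 1
        if math.hypot(px - gx, py - gy) <= threshold:
            correct += 1
    return correct / total if total else 0.0


# Hypothetical example: four joints, the last one not visible.
gt = [(10, 10), (20, 20), (30, 30), (40, 40)]
pred = [(11, 10), (20, 35), (30, 31), (0, 0)]
visible = [True, True, True, False]
score = pck(pred, gt, visible, ref_size=50)  # threshold is 10 pixels
```

In this example two of the three scored joints fall within the 10-pixel threshold, so `score` is 2/3.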

Evaluation

Comparison with Other Methods

17. Examples

Conclusions

