Abstract

Human pose estimation is difficult because of the complex structure of the body and self-occlusion. In this paper, we introduce a marker-less system for human pose estimation that detects and tracks key body parts, namely the head, hands, and feet. Given color and depth images captured by multiple red, green, blue, and depth (RGB-D) cameras, our system constructs a graph model from the segmented regions of each camera and detects the key body parts as a set of extreme points based on accumulative geodesic distances in the graph. During the search process, local detection with a supervised learning model is used to match local body features. A final set of extreme points is selected with a voting scheme and tracked under physical constraints using the unified data received from the multiple cameras. During the tracking process, a Kalman filter-based method is introduced to reduce positional noise and to recover from failures in tracking the extremes. Measured against a commercial system, our system achieves an average accuracy of 87%, which outperforms the previous multi-Kinect system. It can be applied to recognize a human action or to synthesize a motion sequence from a few key poses using a small set of extremes as input data.
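As an illustration of how extreme points can be found from geodesic distances in a graph, the following is a minimal sketch and not the system's actual implementation. It assumes one common reading of "accumulative" geodesic distance: the farthest node from a root is picked, added to the source set, and the search is repeated. The names `dijkstra`, `geodesic_extremes`, and the toy stick-figure graph are hypothetical.

```python
import heapq
import math


def dijkstra(adj, sources):
    """Shortest-path (geodesic) distance from the nearest source to every node."""
    dist = {node: math.inf for node in adj}
    heap = []
    for s in sources:
        dist[s] = 0.0
        heap.append((0.0, s))
    heapq.heapify(heap)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def geodesic_extremes(adj, root, k):
    """Pick k extreme points by repeated farthest-point search.

    Assumption: after each extreme is found it joins the source set, so the
    next search favors nodes far from the root AND from earlier extremes.
    """
    sources = [root]
    extremes = []
    for _ in range(k):
        dist = dijkstra(adj, sources)
        reachable = {n: d for n, d in dist.items() if d < math.inf}
        far = max(reachable, key=reachable.get)
        extremes.append(far)
        sources.append(far)
    return extremes


def undirected(edges):
    """Build an adjacency map from (u, v, weight) triples."""
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


# Toy stick figure; edge weights are made-up limb lengths.
body = undirected([
    ("torso", "neck", 1.5), ("neck", "head", 2.0),
    ("torso", "l_elbow", 2.0), ("l_elbow", "l_hand", 2.0),
    ("torso", "r_elbow", 2.0), ("r_elbow", "r_hand", 2.5),
    ("torso", "l_knee", 3.0), ("l_knee", "l_foot", 3.0),
    ("torso", "r_knee", 3.0), ("r_knee", "r_foot", 3.5),
])
extremes = geodesic_extremes(body, "torso", 5)
```

On this toy graph the five selected nodes are the ends of the five limbs. On real depth data the graph nodes would be segmented regions or surface points, and the extremes would still need the local detection and voting steps described above to be labeled as head, hands, or feet.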

Highlights

  • The detection of human body parts has been widely studied in the computer vision and pattern recognition fields

  • Given a sequence of color and depth images streamed from a single RGB-D camera, the background is subtracted from the images to isolate the human object; the subtraction is based on depth information because depth is robust to illumination changes

  • We introduced a marker-less system for human pose estimation by detecting and tracking key body parts: head, hands, and feet
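The depth-based background subtraction mentioned in the highlights can be sketched as follows. This is a hedged illustration only: the function name `foreground_mask`, the thresholds `min_diff_mm` and `max_range_mm`, and the convention that a depth of 0 means "no reading" are assumptions, not details taken from the paper.

```python
def foreground_mask(depth, background, min_diff_mm=50, max_range_mm=4000):
    """Mark pixels that are clearly closer to the camera than the background.

    depth, background: row-major lists of depth values in millimeters,
    where 0 is assumed to mean "no reading" (a common sensor convention).
    """
    mask = []
    for d_row, b_row in zip(depth, background):
        mask.append([
            # Valid reading within range, and either the background is
            # unknown or the pixel is in front of it by a safe margin.
            0 < d <= max_range_mm and (b == 0 or b - d > min_diff_mm)
            for d, b in zip(d_row, b_row)
        ])
    return mask


# Tiny 2x3 example: background captured once with an empty scene.
background = [[3000, 3000, 3000],
              [3000, 3000, 0]]
depth = [[3000, 2000, 0],
         [2990, 5000, 1500]]
mask = foreground_mask(depth, background)
```

Because the test uses only depth values, a change in lighting that alters the color image leaves the mask unchanged, which is the robustness property the highlight refers to. A practical version would typically operate on image arrays and clean the mask with morphological filtering.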

Summary

Introduction

The detection of human body parts has been widely studied in the computer vision and pattern recognition fields. Using multiple cameras placed around the user, a set of depth images captured from different viewpoints can be combined to compensate for occluded body parts [5,6,7,8,9,10,11]. In these approaches, an optimization problem must be solved to track a list of joints of an articulated model from the depth volume.
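Combining depth data from several viewpoints usually starts by expressing each camera's 3D points in one shared coordinate frame. The sketch below assumes that every camera has already been calibrated, so that a rotation matrix and a translation vector to the world frame are known. The names `to_world` and `merge_views` and the example poses are hypothetical, and this is not the unification procedure of any of the cited works.

```python
def to_world(points, rotation, translation):
    """Apply p_world = R * p_cam + t to each 3D point."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]


def merge_views(views):
    """Concatenate per-camera point lists after moving them to the world frame.

    views: iterable of (points, rotation, translation) triples, one per camera.
    """
    merged = []
    for points, rotation, translation in views:
        merged.extend(to_world(points, rotation, translation))
    return merged


# Camera A defines the world frame (identity pose).
identity = ((1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0))
# Camera B faces camera A from 4 m away: rotated 180 degrees about the Y axis.
facing_back = ((-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, -1.0))

merged = merge_views([
    ([(0.0, 1.0, 3.0)], identity, (0.0, 0.0, 0.0)),     # seen by camera A
    ([(0.0, 1.0, 1.0)], facing_back, (0.0, 0.0, 4.0)),  # same spot, camera B
])
```

In the example both cameras observe the same physical point, and after the transform the two measurements coincide in the world frame. A body part hidden from one camera would simply be missing from that camera's list while still being present in the merged set.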

Related Work
Single Camera Process
Background Subtraction
Graph Construction
Body Parts Detection
Multi-Camera Process
Data Unification
Body Parts Tracking
Noise Removal and Failure Recovery
Body Orientation Estimation
Experimental Results
Tracking Accuracy
Action Recognition
Motion Synthesis
Conclusions