Abstract

In this paper, we present human pose estimation and gesture recognition algorithms that use only depth information. The proposed methods are designed to run on a CPU (central processing unit) alone, so that they can operate on low-cost platforms such as embedded boards. The human pose estimation method is based on an SVM (support vector machine) and superpixels, without prior knowledge of a human body model. In the gesture recognition method, gestures are recognized from the pose information of the human body. To recognize gestures regardless of motion speed, the proposed method uses a keyframe extraction method. Gesture recognition is performed by comparing input keyframes with the keyframes of registered gestures, and the gesture yielding the smallest comparison error is chosen as the recognized gesture. To prevent an unregistered gesture from being recognized as a registered one, we derive a maximum allowable comparison error for each registered gesture by comparing it with the other registered gestures. We evaluated our method on a dataset that we generated. The experimental results show that our method performs fairly well and is applicable in real environments.
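A minimal sketch of the matching step described above, assuming each gesture is represented by a sequence of pose feature vectors (keyframes) and using mean Euclidean distance as the comparison error; the function names (`sequence_error`, `max_allowable_errors`, `recognize`) and the choice of metric are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def sequence_error(keyframes_a, keyframes_b):
    """Mean Euclidean distance between corresponding keyframe pose vectors
    (illustrative metric; the paper's exact comparison error may differ)."""
    a, b = np.asarray(keyframes_a, float), np.asarray(keyframes_b, float)
    n = min(len(a), len(b))          # compare the overlapping keyframes
    return float(np.mean(np.linalg.norm(a[:n] - b[:n], axis=1)))

def max_allowable_errors(registered):
    """Per-gesture bound: the smallest error to any *other* registered gesture
    (assumed reading of how the rejection thresholds are derived)."""
    return {name: min(sequence_error(kf, other_kf)
                      for other, other_kf in registered.items() if other != name)
            for name, kf in registered.items()}

def recognize(input_keyframes, registered, bounds):
    """Pick the registered gesture with the smallest comparison error;
    return None (reject) if that error exceeds the gesture's bound."""
    errors = {name: sequence_error(input_keyframes, kf)
              for name, kf in registered.items()}
    best = min(errors, key=errors.get)
    return best if errors[best] <= bounds[best] else None
```

With a hypothetical dictionary of registered gestures, the bounds are computed once offline (`bounds = max_allowable_errors(registered)`) and then reused for every call to `recognize`.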

Highlights

  • Human pose estimation and gesture recognition are attractive research topics in computer vision and robotics owing to their many applications, including human-computer interaction, game control, and surveillance

  • The human pose estimation result cannot be directly used for gesture recognition, because the joint positions vary depending on the body shape and the distance to the depth sensor

  • These results show that the proposed human pose estimation algorithm is more suitable for gesture recognition algorithms that require fast recognition performance

Summary

Introduction

Human pose estimation and gesture recognition are attractive research topics in computer vision and robotics owing to their many applications, including human-computer interaction, game control, and surveillance. Many human pose estimation methods use a GPU (graphics processing unit) to increase the frame rate and the performance [3,4,5,6]. These methods show remarkable performance, but it is difficult to run them on low-cost systems such as embedded boards or mobile platforms. Wu et al. proposed a matching-based method that uses dynamic time warping to identify users and recognize gestures with joint data from the Kinect for Xbox 360 [12]. We propose human pose estimation and gesture recognition algorithms that use only depth information for robustness to environmental and lighting changes. The proposed algorithms are designed to operate on low-cost systems, such as embedded boards and mobile platforms, without exploiting GPUs. Our pose estimation method is based on a per-pixel classification method in which each pixel on the human body is classified into a body part.
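The per-pixel classification idea can be sketched as follows, assuming superpixels computed with SLIC and a generic RBF-kernel SVM; the specific features (mean depth, depth variance, normalized centroid) and the training data are placeholder assumptions rather than the paper's actual feature design:

```python
import numpy as np
from skimage.segmentation import slic   # superpixel segmentation
from sklearn.svm import SVC             # support vector machine classifier

def superpixel_features(depth, n_segments=200):
    """Segment a depth map into superpixels and build one feature vector per
    superpixel (placeholder features: mean depth, depth variance, centroid)."""
    labels = slic(depth, n_segments=n_segments, channel_axis=None, start_label=0)
    ids = np.unique(labels)
    h, w = depth.shape
    feats = []
    for sp in ids:
        ys, xs = np.nonzero(labels == sp)
        d = depth[ys, xs]
        feats.append([d.mean(), d.var(), ys.mean() / h, xs.mean() / w])
    return labels, ids, np.array(feats)

def classify_body_parts(depth, svm):
    """Label every pixel with the body part predicted for its superpixel."""
    labels, ids, feats = superpixel_features(depth)
    part_map = np.zeros(depth.shape, dtype=int)
    for sp, part in zip(ids, svm.predict(feats)):
        part_map[labels == sp] = part
    return part_map

# Offline training on labelled depth frames (one body-part id per superpixel):
# svm = SVC(kernel="rbf").fit(train_features, train_part_ids)
# part_map = classify_body_parts(depth_frame, svm)
```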

  • Human Pose Estimation
  • Superpixel Feature Generation
  • Pose Estimation
  • Gesture Recognition
  • Key Frame Extraction
  • Action Sequence Matching
  • Experiments
  • Proposed Method
  • Conclusions