Abstract

3D human pose estimation is increasingly used in real-world applications such as sports guidance, limb rehabilitation training, augmented reality, and intelligent security. Most existing human pose estimation methods work from a single RGB image captured by one optical sensor, such as a digital camera. Prior knowledge about the human body is available, such as bone proportions and the angular limits of hinge-joint motion. However, existing methods do not consider the correlation between different joints across multi-view images, and most of them adopt fixed spatial prior constraints, which results in poor generalization. It is therefore essential to build a multi-view image acquisition system that uses optical sensors and customized algorithms to reconstruct the 3D human pose in the image. Inspired by generative adversarial networks (GANs), we used a data-driven method to learn the implicit spatial prior information and classified joints according to their natural connection characteristics. To accelerate the proposed method, we proposed a fully connected network with skip connections and used the SMPL model to reconstruct the 3D human body. Experimental results showed that, compared with other state-of-the-art methods, the proposed method had the smallest average joint error, indicating the best performance. Moreover, the running time of the proposed method was 1.3 seconds per frame, which may not meet real-time requirements but is still much faster than most existing methods.

Highlights

  • Human pose estimation (HPE) refers to algorithmically detecting and localizing the joint points of people in the input from a given optical sensor

  • The candidate human body regions were obtained by the candidate pose region generator, and the potential poses were located in the candidate regions

  • We describe the skinned multi-person linear (SMPL) body model and provide the essential notation here


Summary

Introduction

Human pose estimation (HPE) refers to algorithmically detecting and localizing the joint points of people in the input from a given optical sensor (camera). Estimating human pose is the key to analyzing human behavior. HPE is a fundamental research topic in computer vision with many applications, such as human-computer interaction, human action recognition [1,2,3,4], intelligent security, motion capture, and action detection [5]. Rogez et al. [6] presented an end-to-end architecture named LCR-Net, which comprised positioning, classification, and regression. The candidate human body regions were obtained by the candidate pose region generator, and the potential poses were located in the candidate regions.
