Abstract

This work addresses the problem of 3D human body shape and pose estimation from a single depth image. Most deep-learning-based 3D human pose estimation methods use RGB images rather than depth images, while traditional optimization-based methods using depth images aim to establish point correspondences between the depth images and a template model. In this paper, we propose a novel method to estimate the 3D pose and shape of a human body from depth images. Specifically, based on joint features and the original depth features, we propose a spatial attention feature extractor that captures spatial local features of depth images and 3D joints by learning dynamic weights over the features. In addition, we generalize our method to real depth data through a weakly supervised method. We conduct extensive experiments on SURREAL, Human3.6M, DFAUST, and real depth images of human bodies. The experimental results demonstrate that our 3D human pose estimation method yields good performance.
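The abstract's spatial attention mechanism, which reweights local depth features using dynamic weights derived from joint features, can be illustrated with a minimal sketch. The paper does not specify the exact formulation, so the function below is a generic dot-product attention over spatial locations; the names `depth_feats`, `joint_feats`, and `spatial_attention` are illustrative assumptions, not the authors' API.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def spatial_attention(depth_feats, joint_feats):
    """Hypothetical sketch of a spatial attention feature extractor.

    depth_feats: (N, C) array, one C-dim feature per spatial location.
    joint_feats: (C,) pooled feature vector summarizing the 3D joints.

    Each location is scored by its similarity to the joint feature;
    the softmax of these scores gives the learned "dynamic weights"
    that reweight the local depth features.
    """
    scores = depth_feats @ joint_feats            # (N,) similarity per location
    weights = softmax(scores)                     # dynamic spatial weights, sum to 1
    attended = weights[:, None] * depth_feats     # reweighted local features (N, C)
    return attended, weights

# Usage: 16 spatial locations with 8-dim features.
rng = np.random.default_rng(0)
attended, weights = spatial_attention(rng.normal(size=(16, 8)),
                                      rng.normal(size=8))
```

In a trained network, the score function would be parameterized (e.g. by learned projections) rather than a raw dot product, but the reweighting structure is the same.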
