Abstract

Estimating 3D human body pose and shape is challenging because 2D images are inherently ambiguous and the human body has a complex articulated structure. To address this ill-conditioned problem, we propose an end-to-end method that estimates 3D human shape and pose from multi-view RGB images. In the proposed framework, a CNN with an embedded attention module first extracts image features, and a view-pooling layer combines the features from multiple views. A regression network with a novel geometric constraint on body limbs then estimates the 3D human pose and shape. During training, we additionally employ adversarial learning to help the model regress accurate pose and shape parameters. Extensive experiments on the Human3.6M and MPI-INF-3DHP datasets show that our method achieves competitive results on the 3D pose and shape estimation task.
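As a rough illustration of the view-pooling step, the sketch below fuses per-view feature vectors into a single descriptor. The abstract does not state which pooling operator is used, so the element-wise maximum (with a mean alternative) is an assumption, and the function name `view_pool`, the feature dimension, and the sample values are hypothetical.

```python
import numpy as np


def view_pool(features, mode="max"):
    """Fuse per-view feature vectors into one descriptor.

    features: array-like of shape (num_views, feature_dim).
    mode: "max" or "mean"; both are assumed choices, not taken from the paper.
    """
    features = np.asarray(features, dtype=np.float64)
    if features.ndim != 2:
        raise ValueError("expected shape (num_views, feature_dim)")
    if mode == "max":
        # Keep the strongest response per feature channel across views.
        return features.max(axis=0)
    if mode == "mean":
        # Average the responses per feature channel across views.
        return features.mean(axis=0)
    raise ValueError(f"unknown mode: {mode}")


# Hypothetical features from four camera views, three channels each.
views = np.array([
    [0.1, 0.9, 0.3],
    [0.4, 0.2, 0.8],
    [0.7, 0.5, 0.6],
    [0.0, 0.6, 0.2],
])
pooled = view_pool(views)
```

Both operators are invariant to the order of the views and accept any number of views, which is a plausible reason to pool features before regression rather than concatenate them.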

