Abstract

We tackle the problem of 3D human pose estimation from monocular images for which 2D pose estimates are available. A large number of approaches have been proposed for this task. Some of them avoid modeling the mapping from 2D poses to 3D poses explicitly and instead learn it from training samples. In contrast, other methods use knowledge about the connection between 2D and 3D poses to model the mapping from 2D to 3D explicitly. Surprisingly, up to now there has been no experimental comparison of these two classes of approaches that uses exactly the same data sources and thereby carves out the advantages and disadvantages of each. In this paper we present such a comparison between the most commonly used learning approach for 3D pose estimation, the Gaussian process regressor, and the most commonly used modeling approach, the geometric reconstruction of 3D poses. The results show that the learning-based approach outperforms the modeling approach when there are no large changes in viewpoint or action type compared to the training data. In contrast, the modeling approach shows advantages over the learning approach when there are large differences between training and application data.
