Abstract

In this paper, we propose a subspace learning algorithm based on supervised manifold learning to address the problem of inferring 3D human poses from monocular video frames. Low-dimensional representations of visual features are computed via spectral embedding, regularized by the pairwise relationships between poses, so that locality in the feature space is preserved while similarities in the pose space are taken into account. To deal with the “out-of-sample” problem, we obtain a global linear projection from the embedding, under which the Euclidean distances between transformed feature vectors faithfully reflect the corresponding pose distances. To retrieve the most similar candidate from the exemplar database, we employ a weighted sum of the Euclidean distances of the features, which achieves better accuracy than simply summing the squared distances of all feature types. The experimental results on the HumanEva dataset validate the efficacy of the proposed method.
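
The retrieval step can be illustrated with a minimal sketch. It assumes each feature type has already been projected into its learned subspace and that the per-feature weights are given; the function name `retrieve_nearest`, the data layout, and the weights are hypothetical and not taken from the paper.

```python
import numpy as np


def retrieve_nearest(query, database, weights):
    """Return (index, scores) for the exemplar closest to the query.

    Assumed, illustrative layout (not specified by the paper):
      query    -- list of K 1-D arrays, one per feature type
      database -- list of K 2-D arrays, each of shape (N, d_k)
      weights  -- K non-negative weights, one per feature type
    """
    n_exemplars = database[0].shape[0]
    scores = np.zeros(n_exemplars)
    for q, db, w in zip(query, database, weights):
        # Euclidean distance from the query to every exemplar for this feature type
        scores += w * np.linalg.norm(db - q, axis=1)
    # The exemplar with the smallest weighted sum of distances is the best match
    return int(np.argmin(scores)), scores
```

Changing the weights can change which exemplar is retrieved, which is why a weighted sum can behave differently from an unweighted sum over all feature types.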
