Abstract

Most existing human pose estimation methods fuse multi-stage convolutional modules to learn a shared feature representation. In this paper, we propose EMposeNet, an expectation–maximization (EM) mapping-based network that learns groups of related body parts for human pose estimation. It maps the specific features of related parts from the original fully shared feature space. From the perspective of multi-task learning, human pose estimation can be regarded as a homogeneous multi-task learning problem. Sharing features among related tasks can yield a more compact model and better generalization. However, sharing features among unrelated or weakly related tasks degrades estimation performance. The proposed method applies the EM algorithm to learn the related body parts, so that the predicted keypoint heatmaps are potentially more accurate and spatially more precise. We conduct extensive experiments on two benchmark datasets, the MSCOCO keypoint detection dataset and the MPII human pose dataset, to empirically validate the proposed method. The results on both datasets show that the proposed approach achieves competitive performance.

