Abstract

Recent studies estimate human anatomical key points from a single monocular image, and multichannel heatmaps are the key factor in determining the quality of the resulting pose estimate. Multichannel heatmaps handle both the image-to-coordinate mapping task and the processing of semantic features efficiently. However, most methods ignore the physical constraints and internal relationships of human body parts, so they easily confuse left and right symmetrical parts, which have similar features. Some studies place RNNs on top of the network to incorporate priors about the structure of pose components and body configuration. In this work, a novel top-down convolutional network is proposed that takes these priors into account during training, which improves robustness under complex conditions in the wild. To learn the prior knowledge of human pose configuration, a hierarchy of fully convolutional networks (the discriminator) is used to distinguish real poses from fake ones. Consequently, the pose network is pushed toward pose estimates that the discriminator judges to be real, so its predictions remain plausible in complex situations. The method is validated experimentally on the MS COCO human key point detection task. The proposed approach outperforms the original method and generates robust pose predictions, demonstrating the effectiveness of adversarial learning.
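
To illustrate the image-to-coordinate mapping that multichannel heatmaps provide, here is a minimal sketch. It assumes one heatmap channel per key point and reads each coordinate off at the channel's peak value. The function name `decode_keypoints` and the nested-list heatmap representation are hypothetical choices for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Assumption: a heatmap is a 2D grid of scores, indexed as heatmap[y][x].
Heatmap = List[List[float]]


def decode_keypoints(heatmaps: List[Heatmap]) -> List[Tuple[int, int, float]]:
    """Return one (x, y, score) triple per channel, taken at the channel's peak."""
    keypoints = []
    for channel in heatmaps:
        best = (0, 0, float("-inf"))
        for y, row in enumerate(channel):
            for x, value in enumerate(row):
                # Keep the first location with the highest score seen so far.
                if value > best[2]:
                    best = (x, y, value)
        keypoints.append(best)
    return keypoints


# Two toy 2x2 channels, e.g. one per key point.
example_heatmaps = [
    [[0.0, 0.1], [0.9, 0.2]],
    [[0.3, 0.8], [0.1, 0.0]],
]
example_keypoints = decode_keypoints(example_heatmaps)
```

A peak-based decoder like this treats each channel independently, which is one way to see why symmetrical parts can be confused when nothing constrains the overall body configuration.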
