Abstract

Accurate keypoint positioning is necessary for bottom-up multi-person pose estimation methods to handle scale variation and crowdedness. In this paper, we present DoubleHigherNet: a novel network that learns a scale-aware and precise heatmap representation for the bottom-up process using double high-resolution feature pyramids and coarse-to-fine training. The two feature pyramids in DoubleHigherNet consist of 1/4-resolution feature maps and higher-resolution (1/2) maps generated by attention fusion blocks and transposed convolutions. Benefiting from the training strategy and from multi-resolution and coarse-fine heatmap aggregation, the proposed approach predicts keypoints more accurately and therefore performs better on difficult crowded scenes. DoubleHigherNet-w32 achieves competitive results on CrowdPose-test, surpassing all the top-down methods and the bottom-up SOTA HigherHRNet-w32 (which has a similar number of parameters to DoubleHigherNet-w32).
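The multi-resolution heatmap aggregation mentioned above can be illustrated with a minimal sketch. It assumes that a 1/4-resolution heatmap is upsampled by a factor of 2 and averaged with a 1/2-resolution heatmap. The function names, the nearest-neighbour upsampling and the plain averaging are illustrative assumptions, not the paper's confirmed procedure.

```python
def upsample_nearest(heatmap, factor):
    """Repeat every pixel `factor` times along both axes (nearest neighbour)."""
    out = []
    for row in heatmap:
        wide = [v for v in row for _ in range(factor)]
        out.extend([list(wide) for _ in range(factor)])
    return out


def aggregate_heatmaps(coarse, fine):
    """Average a coarse map (upsampled 2x) with a fine map of twice its size.

    Assumption: simple averaging; the actual aggregation may differ.
    """
    up = upsample_nearest(coarse, 2)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of the coarse map")
    return [
        [(a + b) / 2.0 for a, b in zip(row_up, row_fine)]
        for row_up, row_fine in zip(up, fine)
    ]


# Toy example: a 1x2 "1/4-resolution" map and a 2x4 "1/2-resolution" map.
coarse = [[0.0, 1.0]]
fine = [[0.0, 0.0, 0.5, 1.0],
        [0.0, 0.0, 1.0, 1.0]]
merged = aggregate_heatmaps(coarse, fine)
```

In this toy case the coarse map only says "the keypoint is somewhere on the right", while the fine map sharpens the location within that region.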

Highlights

  • We propose DoubleHigherNet, a network with two cascaded feature pyramids

  • We evaluate the impact of our proposed attention fusion block, coarse-to-fine learning and coarse-fine heatmap aggregation

  • In this paper, we present DoubleHigherNet: a novel network designed for bottom-up multi-person pose estimation


Summary

Introduction

A top-down method first employs a human detector such as Mask R-CNN (He et al., 2017) to obtain the bounding box of each person instance in the image. A bottom-up method instead first determines the identity-free joint positions of all people in the input image by predicting heatmaps for the different body parts, and then groups them into instances of different people. This strategy effectively improves the speed of bottom-up methods and their ability to realize real-time pose estimation. Because they do not depend on a human detector, bottom-up methods perform better on CrowdPose, a benchmark with various dense and difficult scenes. As there is a conflict between estimating small persons and large persons, the second strategy, a feature pyramid, is introduced by HigherHRNet [3] to balance the performance on persons of different scales.
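As a rough illustration of the first bottom-up step (locating identity-free joints from a predicted heatmap), the sketch below extracts local maxima from a single body-part heatmap. The function name `find_peaks`, the 3x3 neighbourhood and the threshold value are assumptions made for illustration. The later grouping of joints into person instances is not shown.

```python
def find_peaks(heatmap, threshold=0.1):
    """Return (x, y, score) for each pixel that is at or above `threshold`
    and strictly greater than all of its neighbours in a 3x3 window."""
    h, w = len(heatmap), len(heatmap[0])
    peaks = []
    for y in range(h):
        for x in range(w):
            v = heatmap[y][x]
            if v < threshold:
                continue
            # Collect the neighbours, clipping the window at the image border.
            neighbours = [
                heatmap[ny][nx]
                for ny in range(max(0, y - 1), min(h, y + 2))
                for nx in range(max(0, x - 1), min(w, x + 2))
                if (ny, nx) != (y, x)
            ]
            if all(v > n for n in neighbours):
                peaks.append((x, y, v))
    return peaks


# Toy heatmap for one body part with two people in the image.
heatmap = [
    [0.0, 0.1, 0.0, 0.0, 0.0],
    [0.1, 0.9, 0.1, 0.0, 0.0],
    [0.0, 0.1, 0.0, 0.2, 0.7],
    [0.0, 0.0, 0.0, 0.1, 0.2],
]
joints = find_peaks(heatmap, threshold=0.5)
# joints == [(1, 1, 0.9), (4, 2, 0.7)]
```

Each returned peak is a candidate joint with no person identity attached, which is why a separate grouping step is needed afterwards.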

