Abstract
Accurate keypoint localization is necessary for bottom-up multi-person pose estimation methods to handle scale variation and crowded scenes. In this paper, we present DoubleHigherNet: a novel network that learns scale-aware and precise heatmap representations for the bottom-up process using double high-resolution feature pyramids and coarse-to-fine training. The two feature pyramids in DoubleHigherNet consist of 1/4-resolution feature maps and higher-resolution (1/2) maps generated by attention fusion blocks and transposed convolutions. Benefiting from the training strategy, multi-resolution and coarse-fine heatmap aggregation, the proposed approach predicts keypoints more accurately and therefore performs better on difficult crowded scenes. DoubleHigherNet-w32 achieves competitive results on CrowdPose-test, surpassing all the top-down methods and the bottom-up SOTA HigherHRNet-w32 (which has a similar number of parameters to DoubleHigherNet-w32).
Highlights
We propose DoubleHigherNet, a network with two cascaded feature pyramids.
We evaluate the impact of our proposed attention fusion block, coarse-to-fine learning and coarse-fine heatmap aggregation.
In this paper, we present DoubleHigherNet: a novel network designed for bottom-up multi-person pose estimation.
Summary
A top-down method first employs a human detector such as Mask R-CNN (He et al. 2017) to obtain the bounding box of each person instance in the image. The bottom-up process first determines the identity-free joint positions of all people in the input image by predicting the heatmaps of different body parts, and then groups them into instances of different people. This strategy effectively improves the speed of bottom-up methods and their ability to realize real-time pose estimation. Because they do not depend on a human detector, bottom-up methods perform better on CrowdPose, a benchmark with various dense and difficult scenes. As there is a conflict between estimating small persons and large persons, the second strategy, the feature pyramid, is introduced by HigherHRNet [3] to balance the performance on persons of different scales.
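To make the idea of multi-resolution heatmap aggregation concrete, the following is a minimal sketch, not the paper's implementation. It assumes square single-keypoint heatmaps predicted at 1/4 and 1/2 of the input resolution, upsamples both to a common size and averages them before picking the peak. The function names (`upsample_nearest`, `aggregate_heatmaps`, `peak`) are hypothetical, and nearest-neighbour upsampling is used here only to keep the example dependency-free; bilinear interpolation is the more usual choice in practice.

```python
def upsample_nearest(heatmap, factor):
    """Upsample a 2D heatmap (list of rows) by an integer factor."""
    out = []
    for row in heatmap:
        # Repeat every value `factor` times along the width...
        wide = [v for v in row for _ in range(factor)]
        # ...and repeat the widened row `factor` times along the height.
        out.extend([list(wide) for _ in range(factor)])
    return out


def aggregate_heatmaps(heatmaps, target_size):
    """Average square heatmaps of different resolutions at target_size.

    Assumes each heatmap side length divides target_size exactly.
    """
    total = [[0.0] * target_size for _ in range(target_size)]
    for hm in heatmaps:
        factor = target_size // len(hm)
        up = upsample_nearest(hm, factor)
        for y in range(target_size):
            for x in range(target_size):
                total[y][x] += up[y][x] / len(heatmaps)
    return total


def peak(heatmap):
    """Return (row, col) of the strongest response in a 2D heatmap."""
    best = max((v, y, x) for y, row in enumerate(heatmap)
               for x, v in enumerate(row))
    return best[1], best[2]


# Toy example for an 8x8 input: a coarse 2x2 map (1/4 resolution)
# and a finer 4x4 map (1/2 resolution) for the same keypoint type.
quarter = [[0.1, 0.8],
           [0.0, 0.1]]
half = [[0.0, 0.0, 0.0, 0.9],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0],
        [0.0, 0.0, 0.0, 0.0]]

fused = aggregate_heatmaps([quarter, half], target_size=8)
row, col = peak(fused)
```

In this toy case the coarse map only says the keypoint lies somewhere in the top-right quadrant, while the finer map narrows it to the top-right corner; the averaged map peaks where both agree. A full bottom-up pipeline would repeat this per keypoint type and then group the detected peaks into person instances, which this sketch does not cover.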