Abstract

Multi-resolution features are important for image-based human pose estimation. In this paper, we present a method that exploits the complete information in a neural network's feature maps at different resolutions to improve the accuracy of human pose estimation. The proposed method, Adaptively Complete Multi-Resolution Feature Fusion (AdaCMRFF), is built on the high-resolution network (HRNet). AdaCMRFF fuses all feature maps using adaptive parameters, which preserve the useful information in feature maps of different resolutions when they are fused into a feature map of a specific resolution. First, the feature maps of different resolutions are resized to the same shape by sampling and convolution strategies. The fusion weight parameters are then generated by \(1 \times 1\) convolutions and a softmax function applied to the resized feature maps. Finally, the feature maps are combined with the fusion weights and added to form a new feature map. AdaCMRFF is applied at all stages of HRNet to retain the useful information of all the feature maps. A series of experiments on two mainstream human pose estimation datasets, COCO2017 and CrowdPose, demonstrates the effect of the proposed method.
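
The three fusion steps described above (resize, generate weights, combine) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. It rests on several assumptions that the abstract does not state: all branches share the same channel count, resizing uses nearest-neighbour sampling as a stand-in for the sampling and convolution strategies, each \(1 \times 1\) convolution maps a branch to a single per-pixel logit, and the final step is read as a per-pixel weighted sum. The function names `resize_nearest`, `conv1x1_logit`, and `adaptive_fuse` are hypothetical.

```python
import math


def resize_nearest(fmap, out_h, out_w):
    """Resize a [C][H][W] map to (out_h, out_w) by nearest-neighbour sampling.

    Assumption: stands in for the sampling/convolution resizing in the text.
    """
    in_h, in_w = len(fmap[0]), len(fmap[0][0])
    return [[[ch[i * in_h // out_h][j * in_w // out_w] for j in range(out_w)]
             for i in range(out_h)]
            for ch in fmap]


def conv1x1_logit(fmap, weights, bias=0.0):
    """1x1 convolution from C channels to one channel.

    A 1x1 convolution is a per-pixel dot product over the channel axis.
    """
    h, w = len(fmap[0]), len(fmap[0][0])
    return [[bias + sum(wc * ch[i][j] for wc, ch in zip(weights, fmap))
             for j in range(w)]
            for i in range(h)]


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def adaptive_fuse(fmaps, conv_weights, target_hw):
    """Fuse several [C][H][W] maps into one map of size target_hw.

    fmaps:        one feature map per resolution branch
    conv_weights: one list of C weights per branch (the 1x1 conv kernels)
    """
    out_h, out_w = target_hw
    # Step 1: bring every branch to the target resolution.
    resized = [resize_nearest(f, out_h, out_w) for f in fmaps]
    # Step 2: one logit map per branch from a 1x1 convolution.
    logits = [conv1x1_logit(f, w) for f, w in zip(resized, conv_weights)]
    channels = len(resized[0])
    fused = [[[0.0] * out_w for _ in range(out_h)] for _ in range(channels)]
    for i in range(out_h):
        for j in range(out_w):
            # Softmax across branches, so the weights at each pixel sum to 1.
            alphas = softmax([lg[i][j] for lg in logits])
            # Step 3: per-pixel weighted sum of the resized branches.
            for c in range(channels):
                fused[c][i][j] = sum(a * r[c][i][j]
                                     for a, r in zip(alphas, resized))
    return fused


# Toy input: a 2x2 high-resolution branch and a 1x1 low-resolution branch.
high = [[[2.0, 2.0], [2.0, 2.0]]]
low = [[[4.0]]]
fused = adaptive_fuse([high, low], [[0.0], [0.0]], (2, 2))
```

With all-zero kernels the logits are equal, so the softmax weights are uniform and the fused map is the plain average of the resized branches. Learned kernels would instead let each pixel favour whichever resolution carries the most useful signal.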
