Abstract

Traditional heatmap-based approaches to human pose estimation usually suffer from drawbacks such as high network complexity or suboptimal accuracy. To address multi-person pose estimation without heatmaps, this paper proposes an end-to-end, lightweight human pose estimation network that builds on YOLO-Pose and adds a multi-scale coordinate attention mechanism, improving overall performance while keeping the network lightweight. Specifically, the lightweight network GhostNet was first integrated into the backbone to reduce model redundancy while still producing a large number of effective feature maps. Then, a coordinate attention mechanism was incorporated to enhance the network's sensitivity to direction and location. Finally, a BiFPN module was fused in to balance feature information across scales and further improve the expressive power of the convolutional features. Experiments on the COCO 2017 validation set showed that, compared with the YOLO-Pose baseline, the proposed network improved average accuracy by 4.8% while reducing the number of network parameters and the amount of computation. These results demonstrate that the proposed method improves the accuracy of human pose estimation while keeping the model lightweight.
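
The abstract does not state how the BiFPN module weights features from different scales. As a minimal sketch, assuming the fast normalized fusion used in the original BiFPN design, each input scale receives a learnable non-negative weight and the output is the weighted sum divided by the sum of the weights. The function name `fast_normalized_fusion`, the flattened-list representation of feature maps, and the value of `eps` are illustrative assumptions, not details taken from the paper.

```python
from typing import List, Sequence


def fast_normalized_fusion(
    features: Sequence[Sequence[float]],
    raw_weights: Sequence[float],
    eps: float = 1e-4,  # assumed small constant for numerical stability
) -> List[float]:
    """Fuse same-sized feature maps with normalized non-negative weights.

    Each feature map is assumed to be already resized to a common
    resolution and flattened into a list of floats (hypothetical layout).
    """
    if len(features) != len(raw_weights):
        raise ValueError("one weight is required per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("all feature maps must have the same size")

    # ReLU keeps every weight non-negative so the normalization is stable.
    weights = [max(0.0, w) for w in raw_weights]
    total = eps + sum(weights)

    fused = [0.0] * length
    for w, feature in zip(weights, features):
        scale = w / total  # normalized contribution of this scale
        for i, value in enumerate(feature):
            fused[i] += scale * value
    return fused


if __name__ == "__main__":
    # Two scales with equal weights: the result is close to their average.
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

With equal weights the fused map is approximately the element-wise mean of the inputs, and a scale whose weight is driven to zero or below no longer contributes, which is one way a network can learn to balance information across scales.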
