Abstract

In recent years, human pose estimation has been widely used in human-computer interaction, augmented reality, video surveillance, and many other fields, yet the task still faces many challenges. To address the large parameter counts and high computational cost of current mainstream human pose estimation networks, this paper proposes a lightweight pose estimation network based on a polarized self-attention mechanism, the Lightweight Polarized Network (LPNet). First, ghost convolution is used to reduce the number of parameters in the feature extraction network. Second, a polarized self-attention module is introduced; it is better suited to the pixel-level regression task, compensates for the loss of extracted features caused by the reduced parameter count, and improves the accuracy of human keypoint regression. Finally, a new coordinate decoding method is designed to reduce the error introduced when decoding heatmaps, further improving the accuracy of keypoint regression. The proposed method was evaluated on the human keypoint detection datasets COCO and MPII and compared with current mainstream methods. The experimental results show that it greatly reduces the number of model parameters at the cost of only a small loss in accuracy.
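
As a rough illustration of why ghost convolution shrinks the feature extractor, the sketch below compares the weight count of a plain convolution with that of a ghost module, following the general ghost module formulation (a primary convolution produces a fraction of the output channels, and cheap depthwise operations generate the rest). The abstract does not give LPNet's layer widths or ghost settings, so the function names, the channel sizes, `ratio=2`, and `dw_kernel=3` are assumptions chosen purely for illustration.

```python
def standard_conv_params(c_in, c_out, k):
    """Weights in a plain k x k convolution (bias ignored)."""
    return c_in * c_out * k * k


def ghost_module_params(c_in, c_out, k, ratio=2, dw_kernel=3):
    """Weights in a ghost module (bias ignored).

    Assumed structure, not taken from the LPNet paper:
      - a primary k x k convolution produces c_out / ratio "intrinsic" channels
      - depthwise dw_kernel x dw_kernel convolutions produce the remaining
        (ratio - 1) * c_out / ratio "ghost" channels from the intrinsic ones
    """
    if c_out % ratio != 0:
        raise ValueError("c_out must be divisible by ratio in this sketch")
    intrinsic = c_out // ratio
    primary = c_in * intrinsic * k * k
    cheap = intrinsic * (ratio - 1) * dw_kernel * dw_kernel
    return primary + cheap


# Hypothetical layer: 64 input channels, 128 output channels, 3 x 3 kernel.
plain = standard_conv_params(64, 128, 3)   # 73728 weights
ghost = ghost_module_params(64, 128, 3)    # 37440 weights
print(f"plain: {plain}, ghost: {ghost}, reduction: {plain / ghost:.2f}x")
```

With a ratio of 2 the weight count for this hypothetical layer falls to roughly half, which is the kind of saving that the attention module and the decoding step are then meant to offset in terms of accuracy.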
