Deep learning has demonstrated remarkable advantages in the field of human pose estimation. However, traditional methods often rely on widening and deepening networks to enhance performance, which increases the parameter count and complexity of the networks. To address this issue, this paper introduces the Ghost Attentional Down network (GADNet), a lightweight human pose estimation network based on HRNet. The network fuses features from high-resolution and low-resolution branches to boost performance. Additionally, GADNet performs feature extraction with GaBlock and GdBlock, which incorporate lightweight convolutions and attention mechanisms, thereby reducing the parameter count and computational complexity of the network. Fusing relationships between different channels ensures that informative feature channels are fully exploited and resolves the issue of feature redundancy. Experimental results on the COCO dataset, under consistent image resolution and environmental settings, demonstrate that GADNet reduces parameter count by 60.7% and computational complexity by 61.2% compared with the HRNet model, while achieving comparable accuracy. Moreover, compared with commonly used human pose estimation networks such as the Cascaded Pyramid Network (CPN), the Stacked Hourglass Network, and HRNet, GADNet achieves high-precision detection of human keypoints with fewer parameters and lower computational complexity, and it attains higher accuracy than MobileNet and ShuffleNet.
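The abstract does not detail the internals of GaBlock and GdBlock, but Ghost-style lightweight convolutions in general save parameters by generating part of the output channels with cheap depthwise operations instead of full convolutions. As a rough, hedged illustration (generic GhostNet-style ghost module, not necessarily the paper's exact blocks; the ratio `s` and cheap-kernel size `d` are assumed values), the parameter counts can be compared directly:

```python
def conv_params(c_in, c_out, k):
    # Standard convolution: every output channel is connected
    # to every input channel through a k x k kernel.
    return c_in * c_out * k * k

def ghost_params(c_in, c_out, k, s=2, d=3):
    # Generic ghost module: a primary convolution produces
    # c_out // s "intrinsic" channels; cheap depthwise d x d
    # operations generate the remaining (ghost) channels.
    # s and d here are illustrative defaults, not the paper's settings.
    intrinsic = c_out // s
    primary = c_in * intrinsic * k * k
    cheap = (c_out - intrinsic) * d * d  # depthwise: one filter per channel
    return primary + cheap

c_in, c_out, k = 256, 256, 3
std = conv_params(c_in, c_out, k)
ghost = ghost_params(c_in, c_out, k)
print(f"standard: {std}, ghost: {ghost}, saving: {1 - ghost / std:.1%}")
```

With `s = 2`, the ghost module uses roughly half the parameters of a standard convolution at the same input/output width, which is the kind of per-layer saving that accumulates into the network-level reductions reported above.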