Human Pose Estimation Based on Multi-Spectral Attention and High Resolution Network

Wanyi Ma, Deping Zhang

doi:10.3724/sp.j.1089.2022.19160

Abstract

To address the loss of feature information during multi-resolution feature fusion in human pose estimation, a lightweight high-resolution human pose estimation network, Lite MSA-HRNet, is designed by integrating a multi-spectral attention mechanism into Lite-HRNet. Multiple frequency components are used to extract richer feature information, which benefits the repeated fusion of features at different resolutions. A deconvolution module placed after the main network fuses the higher-resolution features it generates with the high-resolution features produced by the main network. Channel shuffle, pointwise group convolution, and depthwise separable convolution are introduced to lighten the residual block in the deconvolution module and to speed up keypoint localization. Experimental results on the COCO 2017 dataset show that Lite MSA-HRNet achieves a better balance between accuracy and complexity in human pose estimation than other networks.
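
The lightweight operations named in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation: the function names, the list-based channel representation, and the example sizes (32 channels, 3 x 3 kernels) are assumptions chosen only to show how channel shuffle reorders channels across groups and why a depthwise separable convolution needs fewer weights than a standard convolution.

```python
def channel_shuffle(channels, groups):
    """Interleave channels across groups (reshape -> transpose -> flatten)."""
    n = len(channels)
    if n % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = n // groups
    # View the list as a (groups, per_group) grid, then read it column by column.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution with `groups` groups (bias ignored)."""
    return (c_in // groups) * c_out * k * k


def depthwise_separable_params(c_in, c_out, k):
    """Depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = conv_params(c_in, c_in, k, groups=c_in)  # one filter per channel
    pointwise = conv_params(c_in, c_out, 1)              # mixes channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical sizes, chosen for illustration only.
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
    print(conv_params(32, 32, 3))
    print(depthwise_separable_params(32, 32, 3))
```

With these illustrative sizes, the standard 3 x 3 convolution has 9216 weights and the depthwise separable version has 1312. Channel shuffle matters after a group convolution because, without it, each output group would only ever see inputs from its own group.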

Full Text