Abstract

Accurate pose estimation is crucial for understanding human behavior in images and videos. Given an RGB image, the goal is to accurately locate a set of important keypoints on the body. Understanding human pose and body structure is important for high-level tasks such as human-computer interaction. Human pose estimation commonly suffers from low discrimination between the human body and the background, and pose estimation based on the HRnet network does not make full use of important feature information. To address these problems, MCSA-HRnet (Multi-scale Channel and Spatial Attention), a human pose estimation method based on multi-scale channel and spatial attention, is proposed; it improves HRnet with a channel attention mechanism and a spatial attention mechanism. Working in both the channel domain and the spatial domain, MCSA-HRnet integrates a multi-level attention mechanism into the high-resolution network structure through a dedicated channel attention (CA) block and spatial attention (SA) block. This enables the network to focus on image regions that are strongly associated with the human body rather than on other regions. In the core of the CA block, MCSA-HRnet uses 1×1 convolutions for information extraction; in the SA block, it uses parallel 3×3 and 5×5 convolutions. Parallel convolutions of different sizes produce spatial attention maps at different scales, which strengthens the network's ability to distinguish human features from background features, so that the human body region and its keypoints can be located accurately. The improved method is evaluated on the COCO keypoint dataset, and the results show that MCSA-HRnet effectively improves the accuracy of joint localization in human pose estimation.
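
To make the two attention blocks described above more concrete, the following NumPy sketch shows one plausible way to arrange them. The abstract specifies only the kernel sizes (1×1 in the CA block, parallel 3×3 and 5×5 in the SA block). Everything else here is an assumption made for illustration and is not the authors' implementation: the global average pooling, the reduction ratio `r`, the sigmoid gating, the mean/max channel descriptors, the summation of the two parallel branches, and all function and variable names.

```python
import numpy as np


def conv2d_same(x, w):
    """Naive stride-1 'same' convolution (cross-correlation form).

    x: (C_in, H, W) feature map; w: (C_out, C_in, k, k) with odd k.
    """
    c_out, c_in, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    _, h, wd = x.shape
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            # Mix input channels for kernel offset (i, j) and accumulate.
            out += np.einsum("oc,chw->ohw", w[:, :, i, j],
                             xp[:, i:i + h, j:j + wd])
    return out


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def channel_attention(x, w1, w2):
    """Hypothetical CA block: pooled descriptor -> two 1x1 convs -> gate."""
    pooled = x.mean(axis=(1, 2), keepdims=True)        # (C, 1, 1), assumed
    hidden = np.maximum(conv2d_same(pooled, w1), 0.0)  # 1x1 conv + ReLU
    weights = sigmoid(conv2d_same(hidden, w2))         # (C, 1, 1) in (0, 1)
    return x * weights                                 # reweight channels


def spatial_attention(x, w3, w5):
    """Hypothetical SA block: parallel 3x3 and 5x5 convs, summed (assumed)."""
    desc = np.stack([x.mean(axis=0), x.max(axis=0)])   # (2, H, W), assumed
    attn = sigmoid(conv2d_same(desc, w3) + conv2d_same(desc, w5))  # (1, H, W)
    return x * attn                                    # reweight locations


# Toy example with random weights; sizes are arbitrary.
rng = np.random.default_rng(0)
C, H, W, r = 8, 16, 12, 4
x = rng.standard_normal((C, H, W))
w1 = rng.standard_normal((C // r, C, 1, 1)) * 0.1
w2 = rng.standard_normal((C, C // r, 1, 1)) * 0.1
w3 = rng.standard_normal((1, 2, 3, 3)) * 0.1
w5 = rng.standard_normal((1, 2, 5, 5)) * 0.1

y = spatial_attention(channel_attention(x, w1, w2), w3, w5)
print(y.shape)  # (8, 16, 12)
```

Because both gates lie strictly between 0 and 1, the output keeps the input's shape and can only attenuate activations. In a trained network, the 3×3 branch would respond to finer local structure and the 5×5 branch to a wider context, which is the multi-scale behavior the abstract attributes to the SA block.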
