Abstract

Severe error accumulation is a key reason why multi-person pose estimation remains challenging in complex scenes. Single-stage pose estimation methods detect humans and keypoints in parallel, which avoids the serial error accumulation of two-stage methods. However, existing single-stage methods let these two distinct tasks share identical features, which itself causes errors to accumulate. We therefore propose a Feature Decoupling Network (FDNet) that extends the single-stage pipeline and reconstructs task-specific features. FDNet comprises a Feature Decoupling Module (FDM), a Human Aware Loss (HAL), and a Keypoint Aware Loss (KAL). FDM adaptively perceives spatial and channel features and allocates separate feature domains to the different tasks. To learn distinctive feature representations for compact human bodies, HAL and KAL measure and suppress feature similarities and distances across different humans and keypoints, thereby reducing feature interference. Experiments on crowded-scene datasets show that our method outperforms the state-of-the-art method CID by 1.5% and 0.8% on OCHuman and CrowdPose, respectively.
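The idea of suppressing feature similarity between different instances can be illustrated with a small sketch. This is not the HAL or KAL formulation, which the abstract does not give; it is a generic pairwise cosine-similarity penalty, and the function names `cosine_similarity` and `similarity_penalty` as well as the clamping at zero are assumptions made for illustration.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_penalty(instance_features):
    """Hypothetical penalty: mean positive cosine similarity over all
    pairs of distinct instance feature vectors.

    Assumption for illustration only: each human (or keypoint) is
    summarized by one feature vector. The penalty is high when features
    of different instances point in similar directions, so minimizing it
    pushes them apart.
    """
    pairs = list(combinations(instance_features, 2))
    if not pairs:
        return 0.0  # fewer than two instances: nothing to separate
    total = sum(max(0.0, cosine_similarity(a, b)) for a, b in pairs)
    return total / len(pairs)
```

Under these assumptions, orthogonal instance features incur no penalty, while parallel features incur the maximum penalty of 1, so a training objective that includes such a term favors features that stay distinct even when people overlap.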
