Abstract

Multi-person pose estimation is the task of estimating the coordinates of body joints and predicting the body pose of each person in an image. Research on this problem has achieved breakthroughs in recent years, but existing solutions still have shortcomings. A serious weakness of state-of-the-art models is that the number of poses they detect is generally much larger than the actual number of human instances in the input image. This makes existing models unreliable and thus unusable in real-world tasks. In this paper, we propose a more reliable multi-person pose estimation method consisting of three main blocks: a top-down multi-person pose estimation block, a human detection block, and a pose selection block. The proposed method uses the bounding boxes of the segmented objects to select the best subset of the initial pose set. We formulate the pose selection problem using Conditional Random Fields. First, we introduce a set of potential functions to form a general probability model. Then we propose an inference algorithm that selects the poses that maximize the probability function. Finally, the proposed solution is implemented as a neural network. The proposed pose selection model is model-agnostic and can easily be used in conjunction with other pose estimation and object detection models. Experiments demonstrate that the reliability and precision of the proposed model are higher than those of state-of-the-art models.
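
To make the idea of selecting a subset of poses with detected bounding boxes concrete, the following is a minimal sketch and not the method of the paper: it replaces the Conditional Random Field and its learned inference with a simple greedy one-to-one matching. The names `pose_box_score` and `select_poses`, the scoring rule, and the `min_score` threshold are all assumptions made for illustration.

```python
from typing import List, Tuple

Keypoint = Tuple[float, float, float]    # (x, y, confidence), assumed layout
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def pose_box_score(pose: List[Keypoint], box: Box) -> float:
    """Hypothetical compatibility score: summed confidence of the keypoints
    that fall inside the box, divided by the total number of keypoints."""
    if not pose:
        return 0.0
    x0, y0, x1, y1 = box
    inside = sum(c for x, y, c in pose if x0 <= x <= x1 and y0 <= y <= y1)
    return inside / len(pose)


def select_poses(poses: List[List[Keypoint]], boxes: List[Box],
                 min_score: float = 0.5) -> List[int]:
    """Greedy stand-in for pose selection (not the CRF inference):
    match each box to at most one pose, best-scoring pairs first,
    and return the sorted indices of the poses that were kept."""
    candidates = sorted(
        ((pose_box_score(p, b), i, j)
         for i, p in enumerate(poses)
         for j, b in enumerate(boxes)),
        key=lambda t: -t[0],
    )
    used_poses, used_boxes = set(), set()
    for score, i, j in candidates:
        if score < min_score:
            break  # all remaining pairs score lower still
        if i in used_poses or j in used_boxes:
            continue
        used_poses.add(i)
        used_boxes.add(j)
    return sorted(used_poses)


# Toy input: two detected people, three candidate poses (one spurious).
boxes = [(0, 0, 10, 10), (20, 0, 30, 10)]
poses = [
    [(2, 2, 0.9), (5, 5, 0.8)],    # lies inside the first box
    [(3, 3, 0.6), (40, 40, 0.9)],  # spurious pose, only partly inside a box
    [(22, 2, 0.9), (25, 5, 0.9)],  # lies inside the second box
]
kept = select_poses(poses, boxes)  # indices of the poses retained
```

Because each box can claim at most one pose, the number of retained poses in this sketch never exceeds the number of detected people, which is the property the abstract is after. A CRF formulation would instead score all candidate poses jointly rather than pair by pair.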
