Abstract

Human motion recognition based on computer vision plays an important role in many fields, such as video surveillance, virtual reality, and medical care. To improve the accuracy of multi-person pose estimation and the generalizability of the extracted features, this paper proposes a multi-person pose estimation method based on a deep convolutional neural network. The method follows a top-down structure with two stages. In the first stage, an improved Faster R-CNN detects the bounding boxes that are likely to contain people, and each individual in the complex scene is then isolated by cropping the image to its box. In the second stage, we combine heatmap detection with coordinate regression to solve the single-person pose estimation problem. Specifically, a deep convolutional ResNet produces heatmaps of the human body, and a fully connected conditional random field determines the precise location of each joint. Experimental results demonstrate that our method achieves performance comparable to that of state-of-the-art methods.
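The data flow of the two stages can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the person detector and the ResNet are not reproduced, so the box and the heatmaps are assumed to be given, and a plain per-joint argmax stands in for the conditional random field localization. The function names `crop_box` and `heatmaps_to_joints` are hypothetical.

```python
import numpy as np


def crop_box(image, box):
    """Crop an (H, W, C) image to an integer box (x0, y0, x1, y1).

    The box is clipped to the image bounds; the clipped box is returned
    so that later coordinates can be mapped back to the full image.
    """
    h, w = image.shape[:2]
    x0, y0, x1, y1 = box
    x0, y0 = max(0, x0), max(0, y0)
    x1, y1 = min(w, x1), min(h, y1)
    return image[y0:y1, x0:x1], (x0, y0, x1, y1)


def heatmaps_to_joints(heatmaps, box):
    """Convert per-joint heatmaps (J, Hh, Wh) for one crop to image coordinates.

    Assumption: the peak of each heatmap is taken with argmax. This is a
    simple stand-in, not the CRF-based localization named in the abstract.
    """
    num_joints, hh, wh = heatmaps.shape
    x0, y0, x1, y1 = box
    flat = heatmaps.reshape(num_joints, -1)
    idx = flat.argmax(axis=1)
    ys, xs = np.unravel_index(idx, (hh, wh))
    scores = flat[np.arange(num_joints), idx]
    # Scale heatmap cell centres to the box size, then shift by the box origin.
    px = x0 + (xs + 0.5) * (x1 - x0) / wh
    py = y0 + (ys + 0.5) * (y1 - y0) / hh
    return np.stack([px, py], axis=1), scores


# Usage with made-up inputs: one detected box and two joint heatmaps.
image = np.zeros((100, 200, 3), dtype=np.uint8)
crop, clipped = crop_box(image, (50, 20, 90, 100))  # hypothetical detector output
heatmaps = np.zeros((2, 8, 4))
heatmaps[0, 2, 1] = 1.0
heatmaps[1, 7, 3] = 0.5
joints, scores = heatmaps_to_joints(heatmaps, clipped)
```

Returning the clipped box alongside the crop keeps the second stage independent of the detector: joint positions found on the crop can always be expressed in the coordinates of the original image.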

