Abstract

In crowded public spaces, body and head pose can provide useful cues for understanding human behaviours and intentions. In this paper, we propose a novel framework for locating people and inferring their body and head poses. Human detection and pose estimation are closely related problems, but previous studies have tackled them independently. In this work, we advocate joint detection and recognition of both head and body poses. Our framework learns an ensemble of pose-sensitive human body models whose outputs provide a new representation for poses. To avoid tedious and inconsistent manual annotation when learning pose-sensitive models, we formulate a semi-supervised learning method that bootstraps an initial model from a small set of labelled data and then improves it iteratively by mining data from a large unlabelled dataset. Experiments using data from a busy underground station demonstrate that the proposed method significantly outperforms a state-of-the-art person detector and yields highly accurate head and body pose estimates in crowded public spaces.
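
The bootstrapping idea in the abstract (train on a small labelled set, then repeatedly absorb confidently classified unlabelled samples) can be illustrated with a generic self-training loop. This is only a minimal sketch under stated assumptions, not the method of the paper: the nearest-centroid classifier stands in for the pose-sensitive body models, and the names `centroids`, `predict`, `self_train`, the `min_margin` threshold, and the pose labels `"front"` and `"back"` are all hypothetical.

```python
from math import dist


def centroids(samples):
    """Mean feature vector per label; samples is a list of (features, label)."""
    sums, counts = {}, {}
    for x, y in samples:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: tuple(v / counts[y] for v in acc) for y, acc in sums.items()}


def predict(model, x):
    """Return (label, margin): nearest centroid and its gap to the runner-up."""
    ranked = sorted((dist(x, c), y) for y, c in model.items())
    margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else float("inf")
    return ranked[0][1], margin


def self_train(labelled, unlabelled, min_margin=1.0, max_rounds=10):
    """Bootstrap from labelled data, then mine confident unlabelled samples."""
    train = list(labelled)
    pool = list(unlabelled)
    model = centroids(train)  # initial model from the small labelled set
    for _ in range(max_rounds):
        confident, remaining = [], []
        for x in pool:
            label, margin = predict(model, x)
            # Assumed confidence rule: accept only clear-cut predictions.
            if margin >= min_margin:
                confident.append((x, label))
            else:
                remaining.append(x)
        if not confident:
            break  # nothing new was mined, so the model cannot change
        train.extend(confident)
        pool = remaining
        model = centroids(train)  # retrain on labelled + mined samples
    return model, pool


# Toy 2-D features with two hypothetical pose classes.
labelled = [((0.0, 0.0), "front"), ((10.0, 0.0), "back")]
unlabelled = [(1.0, 0.0), (9.0, 0.0), (5.0, 0.0)]
model, leftover = self_train(labelled, unlabelled)
```

In this toy run the two unambiguous samples are absorbed in the first round, while the sample equidistant from both centroids stays in `leftover` because its margin never reaches the threshold. Leaving ambiguous samples unlabelled is one simple way to limit the label noise that self-training can otherwise amplify.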
