Abstract

In this paper we describe a system for the automatic detection of multiple people in a scene, using only the depth information provided by a Time-of-Flight (ToF) camera placed in an overhead position. The main contribution of this work is a methodology for determining the Regions of Interest (ROIs) and extracting features, which results in robust discrimination between people, with or without accessories, and objects (either static or dynamic), even when people and objects are close together. Since only depth information is used, the developed system preserves users' privacy. The designed algorithm includes two stages: an offline stage and an online stage. In the offline stage, a new depth image dataset has been recorded and labeled, and the labeled images have been used to train a classifier. The online stage is based on robustly detecting local maxima in the depth image (which are candidates to correspond to the heads of the people present in the scene), and an ROI is then carefully defined around each of them. For each ROI, a feature vector is extracted, providing information on the top view of people and objects, including information related to the expected overhead morphology of the head and shoulders. The online stage also includes a pre-filtering process to reduce noise in the depth images. Finally, there is a classification process based on Principal Component Analysis (PCA). The online stage works in real time at an average of 150 fps. To evaluate the proposal, an extensive experimental validation has been carried out, including different numbers of people simultaneously present in the scene, as well as people with different heights, complexions, and accessories. The results are satisfactory, with a 3.1% average error rate.
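
The candidate-detection step of the online stage can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the depth image has already been converted to a height map (camera height minus measured depth, in metres) so that heads appear as local maxima, and the names `find_head_candidates`, `roi_around`, `min_height`, and `radius` are hypothetical. Pre-filtering, feature extraction, and the PCA-based classifier are omitted.

```python
def find_head_candidates(height_map, min_height, radius):
    """Return (row, col) positions of local maxima in a height map.

    Assumption: height_map is a list of rows of heights above the floor,
    so a head seen from overhead is a local maximum.
    """
    rows, cols = len(height_map), len(height_map[0])
    candidates = []
    for r in range(rows):
        for c in range(cols):
            h = height_map[r][c]
            if h < min_height:
                continue  # too low to be a head candidate
            # Neighbourhood window, clipped at the image borders
            window = [
                height_map[i][j]
                for i in range(max(0, r - radius), min(rows, r + radius + 1))
                for j in range(max(0, c - radius), min(cols, c + radius + 1))
            ]
            if h < max(window):
                continue  # a higher pixel exists nearby
            # On a plateau of equal heights, keep only the first pixel found
            if any(abs(r - pr) <= radius and abs(c - pc) <= radius
                   for pr, pc in candidates):
                continue
            candidates.append((r, c))
    return candidates


def roi_around(candidate, radius, shape):
    """Square ROI (row0, col0, row1, col1) around a candidate, end-exclusive."""
    r, c = candidate
    rows, cols = shape
    return (max(0, r - radius), max(0, c - radius),
            min(rows, r + radius + 1), min(cols, c + radius + 1))


# Toy 5x7 height map (metres) with two separate peaks
height_map = [
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.2, 1.5, 1.2, 0.0, 0.0, 0.0],
    [0.0, 1.5, 1.7, 1.5, 0.0, 1.4, 0.0],
    [0.0, 1.2, 1.5, 1.2, 0.0, 1.6, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.4, 0.0],
]
candidates = find_head_candidates(height_map, min_height=1.0, radius=1)
rois = [roi_around(p, 1, (5, 7)) for p in candidates]
```

In this toy example each peak yields one candidate and one ROI. A fixed square ROI is a simplification chosen for brevity; the abstract only states that the ROI is carefully defined around each maximum, without giving its shape.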
