Abstract

This work addresses the problem of human body segmentation from multi-modal visual cues as a first stage of automatic human behavior analysis. We propose a novel RGB-depth-thermal dataset along with a multi-modal segmentation baseline. The three modalities are registered using a calibration device and a registration algorithm. Our baseline extracts regions of interest via background subtraction, partitions the foreground regions into cells, computes a set of image features on those cells using several state-of-the-art feature extraction methods, and models the distribution of the descriptors per cell using probabilistic models. A supervised learning algorithm then fuses the output likelihoods over cells in a stacked feature vector representation. The baseline, using Gaussian mixture models for the probabilistic modeling and a Random Forest for the stacked learning, outperforms other state-of-the-art methods, achieving an overlap above 75% with the manually annotated ground truth of human segmentations on the novel dataset.
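To illustrate the last two stages of the pipeline (per-cell probabilistic modeling and stacked fusion), the following is a minimal sketch in Python, assuming per-cell descriptors have already been extracted from the foreground partition. The grid size, descriptor dimensionality, hyperparameters, and all variable names are illustrative assumptions, not the authors' implementation; scikit-learn's GaussianMixture and RandomForestClassifier stand in for the paper's Gaussian mixture modeling and stacked learning.

```python
# Sketch: per-cell GMM likelihoods fused by a Random Forest (stacked learning).
# Synthetic data stands in for real per-cell descriptors.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

n_cells = 16        # cells in the foreground partition (e.g. a 4x4 grid)
feat_dim = 32       # dimensionality of each cell descriptor (assumed)
n_train, n_test = 200, 50

# Descriptors per sample and cell: shape (samples, cells, features).
X_train = rng.normal(size=(n_train, n_cells, feat_dim))
y_train = rng.integers(0, 2, size=n_train)   # 1 = human, 0 = non-human
X_test = rng.normal(size=(n_test, n_cells, feat_dim))

# 1) Model the descriptor distribution of each cell with a GMM,
#    fitted here on the positive (human) training samples only.
gmms = []
for c in range(n_cells):
    gmm = GaussianMixture(n_components=3, covariance_type="diag",
                          random_state=0)
    gmm.fit(X_train[y_train == 1, c, :])
    gmms.append(gmm)

def stack_likelihoods(X):
    """Stack per-cell GMM log-likelihoods into one feature vector per sample."""
    return np.column_stack(
        [gmms[c].score_samples(X[:, c, :]) for c in range(n_cells)])

# 2) Fuse the stacked per-cell likelihoods with a Random Forest.
rf = RandomForestClassifier(n_estimators=100, random_state=0)
rf.fit(stack_likelihoods(X_train), y_train)
pred = rf.predict(stack_likelihoods(X_test))
```

The stacking step is the key design choice: instead of thresholding each cell's likelihood independently, the Random Forest sees all cells at once and can learn spatial dependencies between them.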
