Abstract

Recently, soft biometric trait classification has received increasing attention in the computer vision community due to its wide range of possible applications. Most approaches in the literature have focused on trait classification in controlled environments, owing to the challenges presented by real-world environments, e.g. arbitrary facial expressions, partial occlusions, nonuniform illumination conditions and background clutter. In recent years, trait classification has started to be applied to real-world environments with some success. However, the focus has been on estimation from single images or video frames, without leveraging the temporal information available in the entire video sequence. In addition, a fixed set of features is usually used for trait classification, without considering how head pose changes may alter the visible facial features. In this paper, we propose a temporal, probabilistic framework that first robustly estimates continuous head pose angles from real-world videos, and then uses this pose estimate to select the appropriate set of frames and features in a temporal fusion scheme for soft biometric trait classification. Experiments on large, real-world video sequences show that our head pose estimator outperforms current state-of-the-art head pose approaches (by up to 51%), while our head-pose-conditioned trait classifier (for the case of gender classification) outperforms current state-of-the-art approaches (by up to 31%).
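The pose-conditioned temporal fusion idea described above can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' actual method: the yaw thresholds, feature-set names, and per-frame scores are all invented for the example, and the fusion rule shown (averaging per-frame class probabilities over frames with usable pose) is one simple instance of temporal fusion.

```python
# Hypothetical sketch of pose-conditioned temporal fusion for gender
# classification. Thresholds, feature-set names, and scores are
# illustrative assumptions, not taken from the paper.

def select_feature_set(yaw_deg):
    """Choose a pose-specific feature set based on estimated yaw
    (hypothetical pose bins)."""
    if abs(yaw_deg) < 15.0:
        return "frontal_features"
    elif abs(yaw_deg) < 45.0:
        return "half_profile_features"
    return "profile_features"

def fuse_over_sequence(frames, max_yaw=45.0):
    """Temporal fusion: average per-frame gender scores over usable frames.

    `frames` is a list of (yaw_deg, scores) pairs, where `scores` maps a
    feature-set name to that frame's classifier probability for one class.
    Frames whose estimated yaw exceeds `max_yaw` are discarded as
    unreliable for classification.
    """
    kept = [(yaw, s) for yaw, s in frames if abs(yaw) <= max_yaw]
    if not kept:
        return None  # no frame had a usable pose
    probs = [scores[select_feature_set(yaw)] for yaw, scores in kept]
    return sum(probs) / len(probs)
```

For instance, a sequence containing a near-frontal frame, a half-profile frame, and a full-profile frame would have the last frame discarded, with the final score averaging the first frame's frontal-feature score and the second frame's half-profile-feature score.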
