Facial Analysis from Continuous Video with Applications to Human-Computer Interface

Thomas S Huang, Antonio J Colmenarez

doi:10.1007/b101848

Abstract

This thesis presents computer vision algorithms for the analysis of video data involving faces. Such video, obtained for example from a camera aimed at the user of an interactive system, is potentially useful for enhancing the interface between users and machines. These image sequences provide information from which machines can identify and keep track of their users, recognize their facial expressions and gestures, and complement other forms of human-computer interfaces. First, we present a learning technique based on information-theoretic discrimination, which is used to construct face and facial feature detectors. Next, we describe a real-time system for face and facial feature detection and tracking in continuous video. Last, we present a probabilistic framework for embedded face and facial expression recognition from image sequences.

The learning technique, referred to in this thesis as information-based maximum discrimination, uses the information-theoretic divergence as the optimization criterion to maximize the discrimination between two classes of objects. The likelihood functions obtained with this technique are then used for object detection, as in maximum likelihood classification between two classes of objects, i.e., faces and background. Because discrete data and probability models are used, the learning procedure and the object classification algorithm can be implemented very efficiently. This has allowed us to develop a real-time system capable of detecting and tracking multiple faces in complex backgrounds, as well as nine facial features.

The algorithm described in this thesis for embedded face and facial expression recognition is based on a novel probabilistic framework in which faces are modeled not only by their appearance, but also by the spatio-temporal deformation patterns of their expressions. Face recognition and facial expression recognition are carried out in a maximum likelihood setup: given an image sequence, the algorithm finds the person's model and the facial expression that maximize the likelihood of the observed images. In this framework, facial appearance matching is enhanced by facial expression modeling. Also, changes in facial features due to expressions are used together with facial deformation patterns to perform expression recognition.
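As a rough illustration of the two-class idea summarized above, the following minimal sketch estimates discrete probability models for faces and background, scores how well they discriminate with the symmetric divergence, and classifies a quantized window by its log-likelihood ratio. It is not the thesis's implementation: the per-pixel independence assumption, the Laplace smoothing, and all function names (`estimate_pmf`, `symmetric_divergence`, `log_likelihood_ratio`, `is_face`) are assumptions made here for illustration, and the thesis itself may use richer discrete models.

```python
import numpy as np

def estimate_pmf(samples, n_levels, alpha=1.0):
    """Estimate one discrete distribution per pixel (assumed independent).

    samples: integer array of shape (n_samples, n_pixels) with values in
    [0, n_levels). alpha is a Laplace smoothing count (an assumption here)
    that keeps every probability strictly positive.
    Returns an array of shape (n_pixels, n_levels) whose rows sum to 1.
    """
    n_pixels = samples.shape[1]
    counts = np.full((n_pixels, n_levels), alpha, dtype=float)
    for j in range(n_pixels):
        counts[j] += np.bincount(samples[:, j], minlength=n_levels)
    return counts / counts.sum(axis=1, keepdims=True)

def symmetric_divergence(p, q):
    """Sum over pixels of KL(p||q) + KL(q||p), written as (p - q) * log(p / q)."""
    return float(np.sum((p - q) * np.log(p / q)))

def log_likelihood_ratio(x, p_face, p_bg):
    """Log-likelihood ratio of one quantized window x under the two models."""
    idx = np.arange(x.shape[0])
    return float(np.sum(np.log(p_face[idx, x]) - np.log(p_bg[idx, x])))

def is_face(x, p_face, p_bg, threshold=0.0):
    """Label the window as a face when the log-likelihood ratio exceeds the threshold."""
    return bool(log_likelihood_ratio(x, p_face, p_bg) > threshold)

# Toy data: 4-pixel "windows" quantized to 4 gray levels (purely synthetic).
faces = np.array([[0, 1, 2, 3]] * 20)
background = np.array([[3, 2, 1, 0]] * 20)
p_face = estimate_pmf(faces, n_levels=4)
p_bg = estimate_pmf(background, n_levels=4)
```

Because the models are lookup tables over discrete values, classifying a window reduces to a handful of table reads and additions, which is consistent with the efficiency argument in the abstract. A larger divergence between the two models indicates that they separate the classes more sharply.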
