Abstract
The use of video sequences for face recognition has been studied relatively less than image-based approaches. In this paper, we present an analysis-by-synthesis framework for face recognition from video sequences that is robust to large changes in facial pose and lighting conditions. This requires tracking the video sequence, as well as recognition algorithms that are able to integrate information over the entire video; we address both of these problems. Our method is based on a recently obtained theoretical result that can integrate the effects of motion, lighting, and shape in generating an image using a perspective camera. This result can be used to estimate the pose and structure of the face and the illumination conditions for each frame in a video sequence in the presence of multiple point and extended light sources. We propose a new inverse compositional estimation approach for this purpose. We then synthesize images using the face model estimated from the training data corresponding to the conditions in the probe sequences. Similarity between the synthesized and the probe images is computed using suitable distance measurements. The method can handle situations where the pose and lighting conditions in the training and testing data are completely disjoint. We show detailed performance analysis results and recognition scores on a large video dataset.
Highlights
It is believed by many that video-based face recognition systems hold promise in certain applications where motion can be used as a cue for face segmentation and tracking, and the presence of more data can increase recognition performance [1]
We present a novel analysis-by-synthesis framework for pose and illumination invariant, video-based face recognition that is based on (i) learning joint illumination and motion models from video, (ii) synthesizing novel views based on the learned parameters, and (iii) designing measurements that can compare two time sequences while being robust to outliers
We can handle a variety of lighting conditions, including the presence of multiple point and extended light sources, which is natural in outdoor environments
Summary
We show experimentally that our method achieves high identification rates under extreme changes of pose and illumination.
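The final matching step, comparing a synthesized sequence against a probe sequence while tolerating outlier frames, can be illustrated with a minimal sketch. This is not the paper's distance measure, which the abstract does not specify. It assumes frames are already aligned and flattened to lists of pixel intensities, and it takes the median of per-frame Euclidean distances as one possible outlier-robust aggregate. The names `frame_distance`, `sequence_distance`, `identify`, and `gallery` are hypothetical.

```python
import math
from statistics import median


def frame_distance(a, b):
    """Euclidean distance between two frames given as flat lists of pixel values."""
    if len(a) != len(b):
        raise ValueError("frames must have the same number of pixels")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def sequence_distance(synth, probe):
    """Median of per-frame distances (an assumed robust aggregate).

    Unlike the mean, the median is barely affected by a few badly
    matched frames, e.g. frames where tracking failed.
    """
    if len(synth) != len(probe) or not probe:
        raise ValueError("sequences must be non-empty and of equal length")
    return median(frame_distance(s, p) for s, p in zip(synth, probe))


def identify(probe, gallery):
    """Return the gallery identity whose synthesized sequence is closest to the probe.

    `gallery` maps an identity label to a sequence synthesized under the
    probe's estimated pose and lighting (the synthesis itself is not shown).
    """
    scores = {name: sequence_distance(seq, probe) for name, seq in gallery.items()}
    best = min(scores, key=scores.get)
    return best, scores


# Toy data: subject "A" matches the probe except for one outlier frame.
probe = [[0, 0], [1, 1], [2, 2]]
gallery = {
    "A": [[0, 0], [1, 1], [50, 50]],
    "B": [[3, 3], [4, 4], [5, 5]],
}
best, scores = identify(probe, gallery)
print(best, scores)
```

In this toy example a mean over frames would favor "B" because of the single corrupted frame in "A", whereas the median still selects "A". Any other robust statistic, such as a trimmed mean, could be substituted under the same assumptions.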