Abstract

In the biometrics community, face and speaker recognition are mature fields in which several systems have been proposed over the past twenty years. While existing systems perform well under controlled recording conditions, mismatch caused by the use of different sensors or by a lack of cooperation from the subject still significantly degrades performance, especially in challenging scenarios such as forensics. Furthermore, existing methods suffer from scalability issues, which prevent them from taking advantage of increasingly large amounts of training data, even though using more data is a promising way to improve accuracy in such scenarios. In this thesis we address these problems of mismatch and complexity by developing scalable probabilistic models that we apply to face, speaker and bimodal recognition. Our contributions are four-fold. First, we propose a unified framework for session variability modeling techniques based on Gaussian mixture models (GMMs) that encompasses inter-session variability (ISV) modeling, joint factor analysis (JFA) and total variability (TV) modeling. Second, we propose a novel exact and scalable formulation of probabilistic linear discriminant analysis (PLDA), a probabilistic generative framework that models between-class and within-class variations. This formulation solves a major scalability issue by reducing the time complexity of the training procedure from cubic to linear with respect to the number of samples per class, and by reducing the complexity of the scoring procedure. Third, the implementations of all the proposed techniques are integrated into a novel collaborative open source software library called Bob, which enforces fair evaluations and encourages reproducible research. Fourth and finally, large-scale experiments are conducted with all of the above machine learning algorithms on several databases, such as FRGC for face recognition, NIST SRE12 for speaker recognition and MOBIO for bimodal recognition, showing competitive performance.
