Statistical Evaluation of Biometric Evidence in Forensic Automatic Speaker Recognition

Andrzej Drygajlo

doi:10.1007/978-3-642-03521-0_1

Abstract

Forensic speaker recognition is the process of determining if a specific individual (suspected speaker) is the source of a questioned voice recording (trace). This paper aims at presenting forensic automatic speaker recognition (FASR) methods that provide a coherent way of quantifying and presenting recorded voice as biometric evidence. In such methods, the biometric evidence consists of the quantified degree of similarity between speaker-dependent features extracted from the trace and speaker-dependent features extracted from recorded speech of a suspect. The interpretation of recorded voice as evidence in the forensic context presents particular challenges, including within-speaker (within-source) variability and between-speakers (between-sources) variability. Consequently, FASR methods must provide a statistical evaluation which gives the court an indication of the strength of the evidence given the estimated within-source and between-sources variabilities. This paper reports on the first ENFSI evaluation campaign through a fake case, organized by the Netherlands Forensic Institute (NFI), as an example, where an automatic method using the Gaussian mixture models (GMMs) and the Bayesian interpretation (BI) framework were implemented for the forensic speaker recognition task.KeywordsGaussian Mixture ModelSpeaker RecognitionRecorded SpeechControl RecordingForensic ExpertThese keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Full Text