Vulnerability of speaker verification systems against voice conversion spoofing attacks: The case of telephone speech

Tomi Kinnunen, Eng Siong Chng, Haizhou Li, Filip Sedlak, Zhi-Zheng Wu, Kong Aik Lee

doi:10.1109/icassp.2012.6288895

Abstract

Voice conversion - the methodology of automatically converting one's utterances to sound as if spoken by another speaker - presents a threat to applications relying on speaker verification. We study the vulnerability of text-independent speaker verification systems to voice conversion attacks using telephone speech. We implemented a voice conversion system with two types of features and nonparallel frame alignment methods, and five speaker verification systems ranging from simple Gaussian mixture models (GMMs) to a state-of-the-art joint factor analysis (JFA) recognizer. Experiments on a subset of the NIST 2006 SRE corpus indicate that the JFA method is the most resilient against conversion attacks. Yet even it experiences a more than 5-fold increase in the false acceptance rate, from 3.24 % to 17.33 %.
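
The "more than 5-fold" figure can be checked directly from the two false acceptance rates quoted above. The snippet below is only an arithmetic sketch: the variable names are illustrative and not taken from the paper, and it assumes that 3.24 % is the rate without the conversion attack and 17.33 % the rate under attack.

```python
# False acceptance rates (in percent) quoted in the abstract for the JFA system.
# Assumption: the first value is the rate without attack, the second under attack.
far_baseline = 3.24
far_attack = 17.33

# Relative (fold) increase and absolute increase in percentage points.
fold_increase = far_attack / far_baseline
absolute_increase = far_attack - far_baseline

print(f"fold increase: {fold_increase:.2f}")
print(f"absolute increase: {absolute_increase:.2f} percentage points")
```

The ratio comes out at roughly 5.35, consistent with the stated more than 5-fold increase, and corresponds to an absolute rise of about 14 percentage points.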

Full Text