Compressed high dimensional features for speaker spoofing detection

Yuanjun Zhao,Victor Sreeram,Roberto Togneri

doi:10.1109/apsipa.2017.8282108

Abstract

The vulnerability in Automatic Speaker Verification (ASV) systems to spoofing attacks such as speech synthesis (SS) and voice conversion (VC) has been recently proved. High- dimensional magnitude and phase based features possess outstanding spoofing detection performance but are not compatible with the Gaussian Mixture Model (GMM) classifiers which are commonly deployed in speaker recognition systems. In this paper, a Compressed Sensing (CS) framework is initially combined with high-dimensional (HD) features and a derived CS-HD based feature is proposed. A standalone spoofing detector assembled with the GMM classifier is evaluated on the ASVspoof 2015 database. Two ASV systems integrated with the spoofing detector are also tested. For the separate detector, an equal error rate (EER) of 0.01% and 5.35% are reached on the evaluation set for known attack and unknown attack, respectively. While for the ASV systems, the best EERs of 0.02% and 5.26% are achieved. The proposed CS-HD feature can obtain similar results with lower dimension than other systems. This suggests that the verification system can be made more computationally efficient.

Full Text