Multi-task learning of deep neural networks for joint automatic speaker verification and spoofing detection

Jiakang Li, Xiongwei Zhang, Meng Sun

doi:10.1109/apsipaasc47483.2019.9023289

Abstract

With the development of spoofing technologies, automatic speaker verification (ASV) systems face serious security challenges. To address this problem, many anti-spoofing countermeasures have been explored. There are two intuitive recipes for protecting an ASV system from spoofing. The first is a cascaded structure, in which spoofing detection is performed first and ASV is then conducted only on the attempts that have passed spoofing detection. The other is to perform spoofing detection and ASV jointly. With traditional methods, the joint system has been shown to discriminate more reliably than cascaded systems, with advantages not only in accuracy but also in convenience and computational efficiency. In this paper, we propose a multi-task learning approach based on deep neural networks to build a joint system for ASV and anti-spoofing. The performance of different acoustic features and deep neural network structures is investigated on the ASVspoof 2017 version 2.0 dataset. The experimental results show that the joint equal error rate (EER) of our approach is reduced by 0.55% compared to a baseline joint system with Gaussian back-end fusion.
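To make the evaluation metric concrete, the following is a minimal sketch of how an equal error rate can be computed from trial scores. It assumes that a joint evaluation pools zero-effort impostor trials and spoofed trials into a single nontarget class, and that a higher score means "accept". The function names and the toy scores are hypothetical illustrations and are not taken from the paper.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for scores where higher means 'accept'.

    The EER is approximated at the observed score where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest.
    """
    # Candidate thresholds: every distinct observed score.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best = None
    for t in thresholds:
        # FRR: genuine target trials scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # FAR: nontarget trials scored at or above the threshold.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


def joint_eer(target_scores, impostor_scores, spoof_scores):
    """Assumed joint metric: impostors and spoofs are both nontarget."""
    eer, _ = equal_error_rate(
        target_scores, list(impostor_scores) + list(spoof_scores)
    )
    return eer


# Toy scores (hypothetical): one spoofed trial outscores one genuine trial.
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.1, 0.2]
spoof = [0.3, 0.6]
print(joint_eer(genuine, impostor, spoof))  # prints 0.25
```

In this toy example the threshold 0.6 rejects one of four genuine trials and accepts one of four nontarget trials, so both error rates equal 0.25.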

Full Text