A pseudo-task design in multi-task learning deep neural network for speaker recognition

Xugang Lu, Hisashi Kawai, Yu Tsao, Peng Shen

doi:10.1109/iscslp.2016.7918433

Abstract

Deep neural networks (DNNs) have been applied successfully in image recognition and automatic speech recognition (ASR) because they learn robust, invariant features for pattern recognition. Recently, they have also been used to learn speaker-specific acoustic features for speaker recognition. However, because a DNN has a large number of parameters to optimize and training data are limited, the model easily overfits to a local solution. An overfitted model generalizes poorly to test data sets. Multi-task learning has been shown to improve DNN generalization, but designing a secondary task for speaker recognition is not trivial. In this study, under the multi-task learning framework, a pseudo-task was designed as an auxiliary task alongside the main task, which is trained with speaker ID labels; the labels for the pseudo-task were obtained from an unsupervised learning algorithm. Speaker recognition experiments were carried out to test the proposed multi-task learning DNN model. The results showed that adding the pseudo-task to multi-task learning improved the performance of the main task.
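
The setup described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not name the unsupervised algorithm, so plain k-means is assumed here for generating pseudo-labels, and the single shared hidden layer, the layer sizes, the function names, and the auxiliary loss weight `aux_weight` are all hypothetical choices made for illustration. The sketch only computes the joint loss (main speaker-ID cross-entropy plus weighted pseudo-task cross-entropy) and omits training.

```python
import numpy as np


def kmeans_labels(x, k, iters=20, seed=0):
    """Pseudo-labels via plain Lloyd's k-means (an assumed choice of algorithm)."""
    rng = np.random.default_rng(seed)
    centers = x[rng.choice(len(x), size=k, replace=False)]
    for _ in range(iters):
        # Squared distance from every point to every center: shape (n, k).
        dist = ((x[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = x[labels == j].mean(axis=0)
    return labels


def log_softmax(z):
    """Numerically stable log-softmax over the last axis of a 2-D array."""
    z = z - z.max(axis=1, keepdims=True)
    return z - np.log(np.exp(z).sum(axis=1, keepdims=True))


def cross_entropy(logits, labels):
    """Mean cross-entropy between logits and integer class labels."""
    return -log_softmax(logits)[np.arange(len(labels)), labels].mean()


def init_params(dim_in, dim_hidden, n_speakers, n_clusters, seed=0):
    """Small random weights for one shared layer and two task-specific heads."""
    rng = np.random.default_rng(seed)
    return {
        "w_shared": 0.1 * rng.normal(size=(dim_in, dim_hidden)),
        "b_shared": np.zeros(dim_hidden),
        "w_main": 0.1 * rng.normal(size=(dim_hidden, n_speakers)),
        "b_main": np.zeros(n_speakers),
        "w_aux": 0.1 * rng.normal(size=(dim_hidden, n_clusters)),
        "b_aux": np.zeros(n_clusters),
    }


def multitask_loss(params, x, speaker_ids, pseudo_labels, aux_weight=0.5):
    """Joint loss: main-task CE plus aux_weight times pseudo-task CE."""
    # Both heads read the same hidden representation.
    hidden = np.tanh(x @ params["w_shared"] + params["b_shared"])
    main_logits = hidden @ params["w_main"] + params["b_main"]
    aux_logits = hidden @ params["w_aux"] + params["b_aux"]
    main = cross_entropy(main_logits, speaker_ids)
    aux = cross_entropy(aux_logits, pseudo_labels)
    return main + aux_weight * aux, main, aux


# Toy data standing in for acoustic features (random, for illustration only).
rng = np.random.default_rng(0)
features = rng.normal(size=(200, 20))
speaker_ids = rng.integers(0, 10, size=200)
pseudo_labels = kmeans_labels(features, k=4)

params = init_params(dim_in=20, dim_hidden=32, n_speakers=10, n_clusters=4)
total, main, aux = multitask_loss(params, features, speaker_ids, pseudo_labels)
```

Because both output heads share the hidden layer, gradients from the pseudo-task would also shape the shared representation during training, which is the mechanism through which an auxiliary task can act as a regularizer for the main task.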

Full Text