Abstract
Speaker verification (SV) is an important branch of speaker recognition, and several approaches to it have been investigated over the last few decades. In this context, deep learning has attracted growing interest from speech processing researchers and has recently been introduced to speaker recognition. In most cases, deep learning models are adapted from speech recognition applications and applied to speaker recognition, where they have proven competitive with state-of-the-art approaches. Nevertheless, the use of deep learning in speaker recognition remains closely tied to speech recognition. In this study, we propose a new way of using deep neural networks (DNNs) in speaker recognition, with the aim of making it easier for the DNN to learn the feature distribution. We are motivated by our previous work, in which we proposed a novel scoring method that performs well on clean speech but needs improvement under noisy conditions. For this reason, we transform the extracted feature vectors (MFCCs) into enhanced feature vectors, which we denote Deep Speaker Features (DeepSFs). Experiments conducted on the THUYG-20 SRE corpus show significant improvements: the proposed method outperforms both the i-vector/PLDA system and our baseline system under both clean and noisy conditions.
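The core idea described above is a DNN that maps raw MFCC feature vectors to enhanced DeepSF vectors. The sketch below is purely illustrative: the abstract does not specify the network topology, feature dimensionality, or training objective, so the layer sizes, the 39-dimensional MFCC assumption, and the `DeepSFExtractor` name are all assumptions made here for demonstration, not the authors' actual architecture.

```python
# Minimal sketch of an MFCC-to-DeepSF mapping network (assumed architecture).
# All dimensions and layer choices are illustrative assumptions.
import torch
import torch.nn as nn

class DeepSFExtractor(nn.Module):
    """Maps MFCC frames to enhanced feature vectors (DeepSFs)."""
    def __init__(self, mfcc_dim: int = 39, hidden_dim: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(mfcc_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, hidden_dim),
            nn.ReLU(),
            # Output kept at the same dimensionality as the input MFCCs,
            # so DeepSFs can drop into the existing scoring pipeline.
            nn.Linear(hidden_dim, mfcc_dim),
        )

    def forward(self, mfcc_frames: torch.Tensor) -> torch.Tensor:
        # mfcc_frames: (num_frames, mfcc_dim) -> DeepSFs: (num_frames, mfcc_dim)
        return self.net(mfcc_frames)

if __name__ == "__main__":
    extractor = DeepSFExtractor()
    utterance = torch.randn(200, 39)   # 200 frames of 39-dim MFCCs (dummy data)
    deep_sfs = extractor(utterance)
    print(deep_sfs.shape)              # torch.Size([200, 39])
```

Keeping the output dimensionality equal to the input is one plausible design choice when the enhanced features are meant to replace MFCCs in an existing scoring method; the actual paper may use a different output space.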