Performance comparison of speaker recognition systems using GMM and i-Vector methods with PNCC and RASTA PLP features

P K Nayana, Abraham Thomas, Dominic Mathew

doi:10.1109/icicict1.2017.8342603

Abstract

Biometric recognition techniques based on features such as the iris, fingerprint, and face are now widely used to verify or authenticate individuals. Speech is one of the most natural forms of human communication, and voice has been shown to provide a unique biometric measure of a person's identity. Features extracted from speech capture speaker-specific characteristics that help to identify a person uniquely. Speaker recognition is now employed in applications such as transaction authentication and forensics. In this paper, the performance of Text-Independent Speaker Recognition (SR) systems implemented using Gaussian Mixture Models (GMM) is compared with that of systems using the i-vector method with a Probabilistic Linear Discriminant Analysis (PLDA) classifier. Both systems have been realized with two types of features, namely Power Normalized Cepstral Coefficients (PNCC) and Relative Spectral Perceptual Linear Prediction (RASTA PLP) coefficients. PNCC features are found to approximate the human auditory system more closely than RASTA PLP features. The results also show that the SR system performs better with GMM for short utterances and with the i-vector method for longer utterances.
