In speaker recognition tasks, one of the reasons for reduced accuracy is due to closely resembling speakers in the acoustic space. In order to increase the discriminative power of the classifier, the system must be able to use only the unique features of a given speaker with respect to his/her acoustically resembling speaker. This paper proposes a technique to reduce the confusion errors, by finding speaker-specific phonemes and formulate a text using the subset of phonemes that are unique, for speaker verification task using i-vector based approach. In this paper spectral features such as linear prediction cepstral co-efficients (LPCC), perceptual linear prediction co-efficients (PLP) and phase feature such as modified group delay are experimented to analyse the importance of speaker-specific-text in speaker verification task. Experiments have been conducted on speaker verification task using speech data of 50 speakers collected in a laboratory environment. The experiments show that the equal error rate (EER) has been decreased significantly using i-vector approach with speaker-specific-text when compared to i-vector approach with random-text using different spectral and phase based features.
Read full abstract