Performance Of Speaker Recognition System Research Articles

The performance of a speaker recognition system depends heavily on the duration of speech used in enrollment and test. This work presents a detailed experimental review and analysis of the GMM-SVM based speaker recognition system in the presence of duration variability. It also compares the performance of the GMM-SVM classifier with that of its precursor, the Gaussian mixture model-universal background model (GMM-UBM) classifier, under the same duration variability. The goal of this work is not to propose a new algorithm for improving speaker recognition performance in the presence of duration variability. Instead, the main focus is on utterance partitioning (UP), a commonly used strategy for compensating for duration variability. We have analysed in detail the impact of training utterance partitioning on speaker recognition performance under the GMM-SVM framework. We further investigate why utterance partitioning is important for boosting speaker recognition performance, and we show in which cases it is useful and in which it is not. Our study reveals that utterance partitioning does not reduce the data imbalance problem of the GMM-SVM classifier, as was claimed in an earlier study. In addition, we discuss, from a speech duration perspective, how parameters such as the number of Gaussians, the supervector length, and the amount of splitting affect performance in short- and long-duration test conditions. The experiments were performed on telephone speech from the POLYCOST corpus, which consists of 130 speakers.
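As a rough illustration of the utterance partitioning idea mentioned in the abstract above, the following minimal sketch splits the frame-level features of one training utterance into several contiguous sub-utterances. The function name `partition_utterance`, the feature dimension, and the choice of contiguous, near-equal splits are assumptions made for this example; the abstract does not specify how the article partitions utterances.

```python
import numpy as np


def partition_utterance(frames: np.ndarray, num_parts: int) -> list:
    """Split a (num_frames, dim) feature matrix into contiguous sub-utterances.

    Hypothetical helper: contiguous, near-equal splits are an assumption,
    not necessarily the partitioning scheme used in the article.
    """
    if num_parts < 1:
        raise ValueError("num_parts must be at least 1")
    if frames.shape[0] < num_parts:
        raise ValueError("utterance has fewer frames than requested parts")
    # np.array_split tolerates frame counts not divisible by num_parts.
    return np.array_split(frames, num_parts, axis=0)


# Synthetic stand-in for one training utterance: 1000 frames of 13-dim features.
rng = np.random.default_rng(0)
frames = rng.normal(size=(1000, 13))

# Each sub-utterance could then yield its own supervector for SVM training.
parts = partition_utterance(frames, num_parts=4)
part_lengths = [p.shape[0] for p in parts]
```

In a GMM-SVM setting, each sub-utterance would typically be adapted into its own GMM supervector, so one long enrollment utterance yields several shorter-duration training examples.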

By optimizing the feature vectors and the Gaussian Mixture Models (GMMs), a hybrid compensation method operating in both the model and feature domains is proposed. The method addresses two problems that arise under unexpected noise environments: speaker recognition features that are affected by noise, and the decline in GMM performance as the length of the training data is reduced. By emulating the human auditory system, Gammatone Filter Cepstral Coefficients (GFCC) are derived from Gammatone filter bank models. Because GFCC reflect only static properties, Gammatone Filter Shifted Delta Cepstral Coefficients (GFSDCC) are extracted based on Shifted Delta Cepstra. Then, for each GMM with sufficient training data, the adaptation process is transformed into a shift factor based on factor analysis. Furthermore, when the training data are insufficient, the coordinate of the shift factor is learned from the GMM mixtures that are insensitive to the training data and is then adapted to compensate the other GMM mixtures. The experimental results show that the recognition rate of the proposed method is 98.46%. The conclusion is that the performance of the speaker recognition system is improved under several kinds of noise environments.
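The shifted delta cepstral step mentioned in the abstract above can be sketched as follows. This is a minimal example of the commonly used d-P-k formulation, in which k delta vectors, each computed over a span of plus or minus d frames and spaced P frames apart, are stacked per frame. The function name, the default parameter values, and the edge-padding strategy are assumptions for illustration; the abstract does not give the configuration used for GFSDCC.

```python
import numpy as np


def shifted_delta_cepstra(cep: np.ndarray, d: int = 1, P: int = 3, k: int = 7) -> np.ndarray:
    """Stack k shifted delta blocks for each frame of a (num_frames, dim) matrix.

    Block i at frame t is cep[t + i*P + d] - cep[t + i*P - d].
    Frames outside the utterance are filled by repeating the edge frames
    (an assumption made here to keep the output length equal to the input).
    """
    num_frames, _ = cep.shape
    # Pad d frames before and enough frames after to cover the last block.
    pad_after = (k - 1) * P + d
    padded = np.pad(cep, ((d, pad_after), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        start = i * P
        ahead = padded[start + 2 * d : start + 2 * d + num_frames]
        behind = padded[start : start + num_frames]
        blocks.append(ahead - behind)
    return np.hstack(blocks)


# Toy input: a one-dimensional ramp, so interior deltas equal 2*d.
ramp = np.arange(20, dtype=float).reshape(20, 1)
sdc = shifted_delta_cepstra(ramp, d=1, P=3, k=7)
```

The output has k times the input dimension per frame; it is commonly concatenated with the static coefficients so that the feature vector carries both static and longer-span dynamic information.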

Articles published on Performance Of Speaker Recognition System

Utterance partitioning for speaker recognition: an experimental review and analysis with new findings under GMM-SVM framework

Performance of speaker recognition system using shifted mfcc, delta spectral cepstral coefficient (DSCC) and Fuzzy techniques

Cost-sensitive learning for emotion robust speaker recognition.

Speaker recognition method based on Mel frequency cepstrum coefficient and inverted Mel frequency cepstrum coefficient

Speaker recognition based on adaptive Gaussian mixture models and fusion of static and dynamic auditory features

Research on Robust Speaker Identification Based on Adaptive Histogram Equalization
