Speaker recognition based on adaptive Gaussian mixture models and fusion of static and dynamic auditory features

吴迪 Wu Di, 曹洁 Cao Jie, 王进花 Wang Jin-Hua

doi:10.3788/ope.20132106.1598

Abstract

A hybrid compensation method that operates in both the model and feature domains is proposed, based on optimizing the feature vectors and the Gaussian Mixture Models (GMMs). The method targets two problems that arise under different unexpected noise environments: speaker recognition features that are affected by noise, and the decline in GMM performance as the length of the training data is reduced. By emulating human auditory perception, Gammatone Filter Cepstral Coefficients (GFCC) are derived from a Gammatone filter bank model. Because GFCC reflect only static properties, Gammatone Filter Shifted Delta Cepstral Coefficients (GFSDCC) are extracted based on Shifted Delta Cepstral processing. Then, for each GMM with sufficient training data, the adaptation process is transformed into a shift factor based on factor analysis. When the training data are insufficient, the coordinates of the shift factor are learned from the GMM mixtures that are insensitive to the training data and are then used to compensate the other GMM mixtures. Experimental results show that the proposed method achieves a recognition rate of 98.46% and that the performance of the speaker recognition system is improved under several kinds of noise environments.
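
To illustrate how Shifted Delta Cepstral processing adds dynamic information to static cepstral frames, the following is a minimal sketch, not the authors' implementation. It assumes the common formulation in which block i of frame t is c[t + i*p + d] - c[t + i*p - d], with k blocks stacked per frame. The function name, the default parameters (d=1, p=3, k=7), and the clamping of indices at the utterance edges are all assumptions that do not come from the abstract.

```python
def shifted_delta_cepstra(frames, d=1, p=3, k=7):
    """Stack k shifted delta blocks for every frame.

    frames: list of equal-length lists of static cepstral coefficients
            (for example GFCC vectors, one per time frame).
    d: delta spread, p: shift between blocks, k: number of blocks.
    All defaults are illustrative assumptions, not values from the paper.
    """
    n = len(frames)
    if n == 0:
        return []

    def frame_at(idx):
        # Assumption: indices outside the utterance are clamped to the edges.
        return frames[min(max(idx, 0), n - 1)]

    stacked = []
    for t in range(n):
        vec = []
        for i in range(k):
            ahead = frame_at(t + i * p + d)
            behind = frame_at(t + i * p - d)
            # Delta block i: difference of frames 2*d apart, shifted by i*p.
            vec.extend(a - b for a, b in zip(ahead, behind))
        stacked.append(vec)
    return stacked


if __name__ == "__main__":
    # Toy static features: 30 frames, 2 coefficients each.
    toy = [[float(t), 2.0 * t] for t in range(30)]
    sdc = shifted_delta_cepstra(toy, d=1, p=3, k=2)
    print(len(sdc), len(sdc[0]))
```

Each output vector has k times as many entries as a static frame; one way to fuse static and dynamic features, under the same assumptions, is to concatenate each static frame with its stacked delta vector.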

Full Text