Abstract

Short utterances and background noise pose a major challenge for speaker verification because of condition mismatch and limited training and/or test data. Automatic speaker verification generally achieves strong performance when training and testing conditions are matched. However, mismatched noisy conditions and short utterances tend to degrade the results significantly. Performance also depends heavily on feature extraction. The most common features in this field are Mel-Frequency Cepstral Coefficients (MFCCs). With background noise and short utterances, MFCC performance may not be reliable without a supporting feature. To address this, a new feature, ‘Entrocy’, is proposed for accurate and robust speaker verification under limited data and noisy environments, and is used to support the MFCC coefficients. The Entrocy feature is the Fourier Transform of the entropy, and captures how the information content of the sound segments fluctuates over time. The resulting Entrocy features are combined with the MFCCs to generate a composite feature, which is tested using a Gaussian Mixture Model (GMM) recognition method. The proposed method was evaluated over a range of signal-to-noise ratios, with utterances truncated to short segments (2, 3, 4, 5, 6, 8, and 10 s) for verification. The proposed method showed strong robustness to background noise and limited testing data, and it consistently performed better than the well-known MFCC features.
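
As a rough illustration of the idea of taking the Fourier Transform of a per-frame entropy sequence, the following Python sketch may help. It is not the authors' implementation: the abstract does not state how the entropy is computed, so spectral (Shannon) entropy of each frame's normalized power spectrum is assumed here, and the function names (`frame_signal`, `frame_entropy`, `entrocy_sketch`) and the parameters `frame_len`, `hop`, and `n_coeffs` are all hypothetical.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Slice a 1-D signal into overlapping frames (rows). Needs len(x) >= frame_len."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def frame_entropy(frames, eps=1e-12):
    """Assumed entropy definition: Shannon entropy (bits) of each frame's
    normalized power spectrum. The abstract does not specify this choice."""
    window = np.hanning(frames.shape[1])
    power = np.abs(np.fft.rfft(frames * window, axis=1)) ** 2
    prob = power / (power.sum(axis=1, keepdims=True) + eps)
    return -(prob * np.log2(prob + eps)).sum(axis=1)

def entrocy_sketch(x, frame_len=400, hop=160, n_coeffs=8):
    """Magnitude of the Fourier Transform of the entropy trajectory.
    Keeping the first n_coeffs bins is an illustrative choice only."""
    entropy = frame_entropy(frame_signal(x, frame_len, hop))
    spectrum = np.abs(np.fft.rfft(entropy - entropy.mean()))
    return spectrum[:n_coeffs]

# Example: one second of synthetic noise at an assumed 16 kHz sampling rate.
rng = np.random.default_rng(0)
features = entrocy_sketch(rng.standard_normal(16000))
```

The abstract does not say how the Entrocy features are combined with the MFCCs or how the GMM is configured, so those steps are left out of the sketch.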
