Abstract
One-dimensional Mel-Frequency Cepstrum Coefficients (1D-MFCC) in conjunction with a power spectrum analysis method are commonly used for feature extraction in speaker identification systems. However, because this one-dimensional feature extraction subsystem shows a low recognition rate when identifying an utterance speech signal under harsh noise conditions, we have developed a speaker identification system based on two-dimensional Bispectrum data, which is theoretically more robust to the addition of Gaussian noise. As the processing sequence of the 1D-MFCC method cannot be applied directly to two-dimensional Bispectrum data, in this paper we propose a 2D-MFCC method as an extension of the 1D-MFCC method, together with an optimization of the 2D filter design using Genetic Algorithms (GA). The 2D-MFCC method with Bispectrum analysis serves as the feature extraction technique, and a Hidden Markov Model serves as the pattern classifier. We evaluate the developed methods experimentally on utterance speech signals buried in various levels of noise. The experimental results show that the 2D-MFCC method without GA optimization achieves a recognition rate comparable to that of the 1D-MFCC method for utterance signals without added noise. However, when the utterance signal is buried in Gaussian noise, the developed 2D-MFCC method shows higher recognition capability, especially when the 2D-MFCC filter optimized by Genetic Algorithms is used.
Highlights
Research on automatic speech and voice identification systems has attracted much interest in the last few years, motivated by the growth of applications in many areas, such as diagnosis of a rotor crack [1], classification of unknown radar targets [2], medical disease [3], and personal and gender identification for security systems [4,5].
As the processing sequence of the 1D Mel-Frequency Cepstrum Coefficients (1D-MFCC) method cannot be applied directly to two-dimensional Bispectrum data, in this paper we propose a 2D-MFCC method as an extension of the 1D-MFCC method, together with an optimization of the 2D filter design using Genetic Algorithms.
We have developed the 2D-MFCC feature extraction method for processing the Bispectrum data from an utterance speech signal.
Summary
Research on automatic speech and voice identification systems has attracted much interest in the last few years, motivated by the growth of applications in many areas, such as diagnosis of a rotor crack [1], classification of unknown radar targets [2], medical disease [3], and personal and gender identification for security systems [4,5]. Speaker-based personal identification is the process of determining which registered speaker produced a given utterance speech signal. In this machine-based speech identification, a gallery of speeches is first enrolled in the system and coded for subsequent searching. When an unidentified speech is presented to the system, it is thoroughly compared with each coded speech in the gallery, and identification is accomplished when a suitable match occurs. The main function of the feature extraction subsystem is to transform the input utterance speech signal into a set of features, while the classifier subsystem has to identify and classify the speaker by comparing the features extracted from his/her input speech signal with those from a database of known speakers.
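The enroll-then-match flow described above can be illustrated with a minimal sketch. It is not the method of this paper: a toy band-energy feature stands in for the 2D-MFCC/Bispectrum features, and nearest-template matching stands in for the Hidden Markov Model classifier. The names `extract_features`, `SpeakerGallery`, and `noisy_tone`, as well as the synthetic tone signals, are hypothetical and chosen only for illustration.

```python
import numpy as np


def extract_features(signal, n_bands=16):
    """Toy feature extractor (assumption, not 2D-MFCC):
    log of the mean power in n_bands equal-width frequency bands."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    bands = np.array_split(power, n_bands)
    return np.log(np.array([band.mean() for band in bands]) + 1e-12)


class SpeakerGallery:
    """Stores one coded template per enrolled speaker (hypothetical helper)."""

    def __init__(self):
        self._templates = {}

    def enroll(self, speaker_id, signal):
        # Code the enrollment utterance and keep it for later searching.
        self._templates[speaker_id] = extract_features(signal)

    def identify(self, signal):
        # Compare the query against every coded template and return the
        # speaker whose template is closest in Euclidean distance.
        query = extract_features(signal)
        return min(
            self._templates,
            key=lambda sid: np.linalg.norm(query - self._templates[sid]),
        )


def noisy_tone(freq_hz, rng, fs=8000, noise_std=0.1):
    """Synthetic stand-in for an utterance: one second of a sine tone
    plus additive Gaussian noise."""
    t = np.arange(fs) / fs
    return np.sin(2 * np.pi * freq_hz * t) + noise_std * rng.standard_normal(fs)


rng = np.random.default_rng(0)
gallery = SpeakerGallery()
gallery.enroll("speaker_a", noisy_tone(200.0, rng))
gallery.enroll("speaker_b", noisy_tone(1500.0, rng))

# A new, independently noisy utterance from the first synthetic "speaker".
print(gallery.identify(noisy_tone(200.0, rng)))
```

In a real system the feature extractor and the matching rule would be replaced by the components the paper describes; the sketch only shows how enrollment, coding, and gallery search fit together.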