Abstract

Speaker Identification (SI) is the task of recognizing a person from their voice using machine learning techniques. Extracting feature information from speaker utterances is an essential step in the speaker identification process, since accurate classification of speakers depends on it. Many speaker identification systems use Mel Frequency Cepstral Coefficients (MFCC) and Dynamic Mel Frequency Cepstral Coefficients (DMFCC) as features because of their potential to represent the repetitive nature of the speech signal. This work examines MFCC and DMFCC features with the aim of improving the recognition accuracy of speaker identification systems. To construct the speaker identity model, the extracted MFCC and DMFCC features were fed as input into a Gaussian Mixture Model (GMM) and a Bayesian classifier. The performance of the GMM and the Bayesian classifier is analyzed for different numbers of mixtures. The GMM reaches a maximum accuracy of 82.7% for MFCC and 80.12% for DMFCC, whereas the Bayesian classifier reaches 79.43% for MFCC and 83.92% for DMFCC. The feature extraction and classification approach can be applied broadly to different types of speaker datasets.
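
The abstract does not define how the dynamic coefficients are obtained. As an illustration only, the sketch below assumes that DMFCC refers to the common delta (first-order regression) coefficients computed over neighbouring MFCC frames. The function names `delta` and `append_delta`, the window `width`, and the edge-padding choice are hypothetical and are not taken from the paper.

```python
def delta(frames, width=2):
    """Regression-based delta coefficients (an assumed reading of DMFCC).

    frames: list of per-frame coefficient lists, e.g. MFCC vectors.
    width:  number of frames on each side used in the regression.
    Frames beyond the edges are replaced by the first or last frame.
    """
    if not frames:
        return []
    num_frames = len(frames)
    num_coeffs = len(frames[0])
    # Normalisation term: 2 * sum_{n=1..width} n^2
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(num_coeffs):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def append_delta(frames, width=2):
    """Concatenate each static frame with its delta coefficients."""
    return [s + d for s, d in zip(frames, delta(frames, width))]
```

Under this assumption, a feature stream that rises linearly over time yields a constant delta in the interior frames, and a constant stream yields zeros. The resulting vectors would then be passed to whichever classifier is being evaluated.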
