Speaker Accent Recognition Using Machine Learning Algorithms

Ahmet Aytug Ayranci, Tulay Yildirim, Sergen Atay

doi:10.1109/asyu50717.2020.9259902

Abstract

Speaker recognition is the task of identifying a speaker from a recorded voice signal. Speech and speaker recognition are important in many areas, such as online banking, telephone shopping, and security applications. Machine Learning (ML) algorithms can be used to analyze and verify speech and speaker identity: with enough data, it is possible to train a program to identify both. In this paper, several ML algorithms are used to identify speaker accents. The data set includes 329 speakers of English with 6 different accents, and the spoken words have been converted to a metric representation using Mel-Frequency Cepstral Coefficients (MFCCs). MFCC is one of the most widely used techniques in speech recognition because of its high feature extraction performance, and this data set uses MFCCs to convert speech to data. In this study, 7 classification ML algorithms, namely Multilayer Perceptron (MLP), Random Forest (RF), Decision Tree (DT), Radial Basis Function (RBF), k-Nearest Neighbor (k-NN), Naive Bayes (NB), and Logistic Model Tree (LMT), are applied to the Speaker Accent Recognition data set from the UCI ML Repository. Performance metrics, namely accuracy, the Kappa statistic, Mean Absolute Error (MAE), and Root Mean Squared Error (RMSE), have been acquired and compared for each algorithm.
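To make the four evaluation metrics concrete, the following is a minimal Python sketch of how they could be computed for a multi-class accent classifier. It is not the authors' code. The class labels, the toy predictions, and all function names are hypothetical, and the abstract does not say how MAE and RMSE are defined for a classifier; the sketch assumes one common convention, in which the predicted class probabilities are compared against a one-hot encoding of the true class.

```python
import math
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's kappa: agreement corrected for chance agreement.

    Undefined (division by zero) if the chance agreement equals 1.
    """
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement from the marginal label frequencies.
    p_chance = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    return (p_observed - p_chance) / (1 - p_chance)


def probability_errors(y_true, y_prob, classes):
    """Return (MAE, RMSE) between predicted probabilities and one-hot targets.

    ASSUMPTION: errors are averaged over every (instance, class) pair.
    The paper may use a different definition.
    """
    abs_sum = 0.0
    sq_sum = 0.0
    count = 0
    for true_label, probs in zip(y_true, y_prob):
        for cls, p in zip(classes, probs):
            target = 1.0 if cls == true_label else 0.0
            diff = p - target
            abs_sum += abs(diff)
            sq_sum += diff * diff
            count += 1
    return abs_sum / count, math.sqrt(sq_sum / count)


# Hypothetical toy data: three accent labels, four utterances.
classes = ["ES", "FR", "US"]
y_true = ["ES", "FR", "US", "US"]
y_prob = [
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
    [0.1, 0.2, 0.7],
    [0.0, 0.1, 0.9],
]
# Predicted label = class with the highest predicted probability.
y_pred = [classes[max(range(len(p)), key=p.__getitem__)] for p in y_prob]

mae, rmse = probability_errors(y_true, y_prob, classes)
print("accuracy:", accuracy(y_true, y_pred))
print("kappa:", cohen_kappa(y_true, y_pred))
print("MAE:", mae)
print("RMSE:", rmse)
```

Accuracy and kappa depend only on the predicted labels, while MAE and RMSE under this convention also reflect how confident the classifier was, which is one reason a study might report all four together.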

Full Text