Abstract

Over the last few decades, artificial intelligence and machine learning have advanced considerably. These advances have driven much work on supporting language learning with computers, known as Computer-Assisted Language Learning (CALL). Mispronunciation detection is one of the central tasks of a CALL system. An efficient mispronunciation detection model benefits second-language learners by providing phoneme-level feedback. In this paper, we introduce a phone grouping technique for mispronunciation detection that is based on mistake probability. We treat mispronunciation detection as a classification problem. Traditionally, a separate classifier is trained for each phoneme mistake, which requires a lot of memory and time. Instead, we group phonemes by their mistake probability, which reduces the number of classifiers to be trained and saves memory and time. We use a Support Vector Machine (SVM) classifier and test it on an Arabic dataset (28 phonemes). The model is evaluated using a confusion matrix and achieves an accuracy of 88%. Our approach outperforms existing systems developed for Arabic phonemes in terms of accuracy and is also more time- and memory-efficient.
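The abstract does not spell out the grouping rule, so the following is only a rough sketch of one possible reading: phonemes are merged into the same group whenever the probability of mistaking one for the other reaches a threshold, and one classifier is then trained per group rather than per mistake. The function name `group_by_mistake_probability`, the threshold value, the placeholder labels `p1` to `p6`, and the probabilities are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def group_by_mistake_probability(mistake_prob, threshold):
    """Merge phonemes linked by a mistake probability >= threshold.

    mistake_prob maps (target, produced) pairs to an estimated probability.
    This grouping rule is an assumption made for illustration only.
    """
    parent = {}

    def find(p):
        # Union-find lookup with path halving; registers unseen phonemes.
        parent.setdefault(p, p)
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for (target, produced), prob in mistake_prob.items():
        root_t, root_p = find(target), find(produced)
        if prob >= threshold and root_t != root_p:
            parent[root_p] = root_t  # merge the two groups

    groups = defaultdict(set)
    for p in list(parent):
        groups[find(p)].add(p)
    return sorted(sorted(g) for g in groups.values())


# Made-up probabilities for six placeholder phonemes (not real data).
mistake_prob = {
    ("p1", "p2"): 0.40,
    ("p2", "p3"): 0.30,
    ("p4", "p5"): 0.35,
    ("p5", "p6"): 0.05,
}

groups = group_by_mistake_probability(mistake_prob, threshold=0.2)
print(groups)  # [['p1', 'p2', 'p3'], ['p4', 'p5'], ['p6']]
```

In this toy input, four mistake pairs collapse into three groups, which is the kind of reduction in classifier count the abstract describes. The actual criterion used by the authors may differ.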

Highlights

  • Due to advancements in technology, the world has become a global village

  • Computer-Assisted Language Learning (CALL) systems perform many tasks, including automatic speech recognition, pronunciation scoring, and mispronunciation detection

  • The Root Mean Square (RMS) value represents the average power of a signal and is related to its amplitude
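As a small illustration of the RMS highlight: the mean of the squared samples is the average power of a discrete signal, and the RMS value is its square root. The sketch below uses only the Python standard library. The helper names `rms` and `frame_rms`, and the frame and hop sizes, are assumptions made for this example and do not come from the paper.

```python
import math


def rms(samples):
    """Root mean square of a sequence of audio samples."""
    if not samples:
        raise ValueError("samples must not be empty")
    mean_power = sum(x * x for x in samples) / len(samples)  # average power
    return math.sqrt(mean_power)


def frame_rms(samples, frame_length, hop_length):
    """RMS of each full frame; a trailing partial frame is ignored."""
    last_start = len(samples) - frame_length
    return [
        rms(samples[start:start + frame_length])
        for start in range(0, last_start + 1, hop_length)
    ]


# A sine tone of amplitude 0.5 over whole cycles has RMS 0.5 / sqrt(2).
tone = [0.5 * math.sin(2 * math.pi * 5 * n / 1000) for n in range(1000)]
tone_rms = rms(tone)
per_frame = frame_rms(tone, frame_length=200, hop_length=100)
```

Frame-wise RMS of this kind is commonly used as a short-time energy feature. Whether the paper computes it per frame or per phoneme segment is not stated in this summary.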


Summary

INTRODUCTION

Due to advancements in technology, the world has become a global village. People living in different parts of the world can communicate with one another, so there is an increasing demand for learning new languages [1]. Speech technology has improved dramatically over the last decade, and by combining it with machine learning techniques, many CALL systems have been developed that are more useful and intelligent than ever. These systems detect a learner's pronunciation mistakes and provide feedback [3]. Mispronunciation detection requires calculating pronunciation scores at the local level, which is usually the phoneme level. We propose a classifier-based approach for mispronunciation detection of Arabic phonemes. Hindi et al. [11] calculated the GOP score to identify pronunciation mistakes in five Arabic phonemes that were frequently mispronounced by non-native Arabic speakers. [...] different classifiers (decision trees, random forest, gradient boosting, SVM with a linear kernel, SVM with radial basis function, and binomial logistic regression) for classification, and among those support [...] Linear Discriminant Analysis or a decision tree for mispronunciation detection of three sounds [...]

Deep-Learning based Methods
Features used for mispronunciation detection
Evaluation metrics
Feature Extraction
Roll-Off
Entropy
Zero-Crossing Rate
Energy features
Root Mean Square
Spectral Features
Data Pre-Processing
Feature Selection
Find n closest misses Mn from class Cl where
Grouping of Phonemes
Classifiers
SVM Classifier
Naïve Bayes
K-Nearest Neighbor
Dataset
Results and Discussions
Evaluation Metric
State-of-the-art comparison
Discussion
CONCLUSION