English Language Accent Classification and Conversion using Machine Learning

Pratik Parikh, Ketaki Velhal, Aayushi Sikligar, Sanika Potdar, Ruhina Karani

doi:10.2139/ssrn.3600748

Abstract

Language is an integral part of human communication. A non-native speaker who is as fluent as a native speaker may still not be understood properly because of differences in accent. Detecting a person’s accent is itself a difficult task because the variations in pronunciation are subtle. To reduce the communication barriers that arise from differing accents, we propose a system to detect and convert accented speech; it differentiates one accent from another with an accuracy of 68.67%. The main motivation was to address the difficulty Indian listeners have in understanding foreign accents and the difficulty foreigners have in understanding the Indian accent. The scope of this project is limited to the English language. Accent detection involves identifying the speaker’s native language. We introduce a novel method that augments traditionally used techniques with a Convolutional Neural Network (CNN) fused with a Deep Neural Network (DNN), along with Recurrent Neural Networks (RNN), to improve accuracy. For accent conversion, the audio features mainly consist of Mel-Frequency Cepstral Coefficients (MFCCs), Aperiodicity (AP) and Fundamental Frequency (F0). Once extracted, these features are processed by a Generative Adversarial Network (GAN) to convert the source accent to the target accent.
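To make one of the features named above concrete, the following is a minimal sketch of how a fundamental frequency (F0) value might be estimated from a single voiced frame using autocorrelation. It is an illustration only and is not taken from the paper: the function name `estimate_f0`, the search range `f_min`/`f_max`, the 16 kHz sampling rate and the synthetic test tone are all assumptions made for this example, and the authors may well have used a different F0 extractor.

```python
import numpy as np


def estimate_f0(frame, sample_rate, f_min=50.0, f_max=500.0):
    """Estimate F0 (in Hz) of one voiced frame by autocorrelation.

    Hypothetical helper for illustration; f_min and f_max are assumed
    bounds on the pitch range to search.
    """
    # Remove any DC offset so it does not dominate the autocorrelation.
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Convert the allowed pitch range into a range of lags (in samples).
    lag_min = int(sample_rate / f_max)
    lag_max = min(int(sample_rate / f_min), len(acf) - 1)
    # The lag with the strongest self-similarity approximates one pitch period.
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Synthetic stand-in for a voiced frame: a 200 Hz tone plus one harmonic.
sample_rate = 16000  # assumed sampling rate
t = np.arange(int(0.04 * sample_rate)) / sample_rate  # 40 ms frame
tone = np.sin(2 * np.pi * 200 * t) + 0.5 * np.sin(2 * np.pi * 400 * t)

f0 = estimate_f0(tone, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

On real speech such an estimator would be applied frame by frame and combined with a voicing decision; MFCCs and aperiodicity would be extracted separately before the features are passed to the conversion model.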

Full Text