Abstract

Every country has its own characteristics and culture, and one of these characteristics is the accent of its speakers. Listening to someone's accent can help identify the speaker's country of origin. Accent recognition belongs to Automatic Speech Recognition (ASR), a technology that is still under active development. Virtual assistants are one example of ASR technology, and this line of research could make them more intelligent by allowing them to identify a speaker's accent. In this study, the authors classify accents from various countries into five classes: English, Spanish, Mandarin, French, and Arabic. The dataset consists of 1,231 audio recordings (627 English, 220 Spanish, 132 Mandarin, 80 French, and 172 Arabic), all of which contain the same English text. The audio features used are Mel-Frequency Cepstral Coefficients (MFCC), Zero Crossing Rate (ZCR), and Energy (called RMS in librosa). Feature extraction produces an array for each audio recording, and this array is the input to a Convolutional Neural Network (CNN) that classifies the accent. The model reached an accuracy of 51.30% with the MFCC feature, 48.05% with the ZCR feature, and 51.95% with the Energy feature. The Energy feature therefore gave the highest accuracy, followed by MFCC and then ZCR.
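As an illustration of two of the features named above, the following is a minimal sketch of frame-wise Zero Crossing Rate and RMS energy in plain Python. It is not the authors' code. The abstract indicates that librosa was used, where the corresponding functions are `librosa.feature.zero_crossing_rate` and `librosa.feature.rms`. The function names, the frame length of 2048 samples and hop of 512 samples (chosen here to mirror librosa's defaults), the 16 kHz sample rate, and the synthetic test tone are all assumptions made for this example. Unlike librosa's default behavior, the sketch does not pad the signal, so frame counts will differ. MFCC is omitted because it requires an FFT and a mel filter bank.

```python
import math


def frame_signal(samples, frame_length=2048, hop_length=512):
    """Split a 1-D sequence into overlapping frames (no padding)."""
    last_start = len(samples) - frame_length
    return [
        samples[start:start + frame_length]
        for start in range(0, last_start + 1, hop_length)
    ]


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def rms_energy(frame):
    """Root-mean-square amplitude of one frame."""
    return math.sqrt(sum(x * x for x in frame) / len(frame))


def extract_features(samples, frame_length=2048, hop_length=512):
    """Return per-frame ZCR and RMS lists for one recording."""
    frames = frame_signal(samples, frame_length, hop_length)
    zcr = [zero_crossing_rate(f) for f in frames]
    rms = [rms_energy(f) for f in frames]
    return zcr, rms


# Hypothetical input: one second of a 440 Hz sine at an assumed 16 kHz rate,
# standing in for a decoded audio recording.
SAMPLE_RATE = 16000
tone = [
    math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
    for n in range(SAMPLE_RATE)
]
zcr, rms = extract_features(tone)
print(len(zcr), round(zcr[0], 3), round(rms[0], 3))
```

Each recording yields one value per frame for each feature, which matches the abstract's description of feature extraction producing an array per audio recording. How those arrays are shaped or padded before entering the CNN is not stated in the abstract.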

