Abstract

Speech recognition systems degrade when speakers' accents or dialects introduce variability into the speech signal. One remedy is to identify the speaker's accent or dialect correctly and use that information to adapt the recognizer. In this paper, we apply extreme learning machines (ELMs) and support vector machines (SVMs) to accent/dialect classification on the TIMIT dataset. We use Mel-frequency cepstral coefficients (MFCCs) and the normalized energy parameter, along with their first and second derivatives, as raw features for training the ELMs and SVMs. We propose a weighted accent classification algorithm that uses a novel architecture to classify North American accents into seven groups. With this algorithm, we obtain a classification accuracy of 77.88% using ELMs, which, to our knowledge, is the best result reported for accent classification on the TIMIT dataset. We also compare ELMs with SVMs as classifiers within our weighted accent classification algorithm, and against multi-class classification using ELMs or SVMs.
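For readers unfamiliar with ELMs, the following is a minimal sketch of a generic single-hidden-layer ELM used as a multi-class classifier: the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved for in closed form with a pseudoinverse. It is not the weighted accent classification algorithm proposed in the paper, whose architecture the abstract does not describe. The function names (`train_elm`, `predict_elm`), the `tanh` activation, and the hidden-layer size are illustrative assumptions, and the input rows stand in for per-utterance feature vectors such as MFCCs with their derivatives.

```python
import numpy as np

def train_elm(X, y, n_hidden=64, n_classes=None, seed=0):
    """Fit a basic ELM. X: (n_samples, n_features), y: integer class labels.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    if n_classes is None:
        n_classes = int(y.max()) + 1
    # Random input weights and biases: drawn once, never trained.
    W = rng.standard_normal((X.shape[1], n_hidden))
    b = rng.standard_normal(n_hidden)
    # Hidden-layer activations (tanh is an assumed choice of nonlinearity).
    H = np.tanh(X @ W + b)
    # One-hot targets, one column per class.
    T = np.eye(n_classes)[y]
    # Output weights: least-squares solution via the Moore-Penrose pseudoinverse.
    beta = np.linalg.pinv(H) @ T
    return W, b, beta

def predict_elm(model, X):
    """Return the predicted class index for each row of X."""
    W, b, beta = model
    scores = np.tanh(X @ W + b) @ beta
    return np.argmax(scores, axis=1)
```

A call such as `model = train_elm(X_train, y_train, n_classes=7)` followed by `predict_elm(model, X_test)` would correspond to the plain multi-class ELM setup mentioned in the abstract, under the assumptions above.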
