Abstract
Language no longer has geographic boundaries. For automated systems such as audio-based search engines, telemedicine, and phone-based emergency services, the first requirement is therefore to identify the language being spoken. The fundamental difficulty of automatic speech recognition is that speech signals vary significantly across speakers, languages, age- and sex-related voice characteristics, content, and acoustic conditions. In this paper, we propose a deep-learning-based ensemble architecture, called FuzzyGCP, for spoken language identification from speech signals. The architecture combines the classification principles of a Deep Dumb Multi Layer Perceptron (DDMLP), a Deep Convolutional Neural Network (DCNN), and a Semi-supervised Generative Adversarial Network (SSGAN) to maximize precision, and then applies ensemble learning using the Choquet integral to predict the final output, i.e., the language class. We evaluate our model on four standard benchmark datasets, comprising two Indic language datasets and two foreign language datasets. Irrespective of the language, the F1-score of the proposed model is as high as 98% on the MaSS dataset, and its worst performance is 67% on the VoxForge dataset, which is still much better than the maximum of 44% achieved by state-of-the-art models on multi-class classification. The link to the source code of our model is available here.
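To make the Choquet-integral ensemble step concrete, the following is a minimal sketch of how per-class confidence scores from three classifiers could be fused. It is an illustration under stated assumptions, not the FuzzyGCP implementation: the function names `choquet_integral` and `fuse`, the classifier ordering (0 = DDMLP, 1 = DCNN, 2 = SSGAN), and every value in the fuzzy measure `MEASURE` are hypothetical and chosen only so that the measure is monotone and normalized.

```python
import numpy as np

# Hypothetical fuzzy measure over subsets of the three classifiers
# (0 = DDMLP, 1 = DCNN, 2 = SSGAN). The values are illustrative only;
# they are monotone and satisfy mu(all classifiers) = 1.
MEASURE = {
    frozenset({0}): 0.3,
    frozenset({1}): 0.4,
    frozenset({2}): 0.4,
    frozenset({0, 1}): 0.6,
    frozenset({0, 2}): 0.6,
    frozenset({1, 2}): 0.7,
    frozenset({0, 1, 2}): 1.0,
}


def choquet_integral(scores, measure):
    """Discrete Choquet integral of one score per classifier
    with respect to a fuzzy measure {frozenset(indices): weight}."""
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)  # classifier indices, lowest score first
    total, previous = 0.0, 0.0
    for rank, idx in enumerate(order):
        # Coalition of classifiers whose score is at least scores[idx].
        coalition = frozenset(int(j) for j in order[rank:])
        total += (scores[idx] - previous) * measure[coalition]
        previous = scores[idx]
    return total


def fuse(prob_matrix, measure):
    """Fuse a (n_classifiers, n_classes) score matrix class by class.
    Returns the index of the winning class and the fused scores."""
    prob_matrix = np.asarray(prob_matrix, dtype=float)
    fused = np.array([
        choquet_integral(prob_matrix[:, c], measure)
        for c in range(prob_matrix.shape[1])
    ])
    return int(np.argmax(fused)), fused


# Example: three classifiers scoring two candidate language classes.
probs = [[0.2, 0.8],   # DDMLP (assumed row order)
         [0.5, 0.5],   # DCNN
         [0.9, 0.1]]   # SSGAN
predicted_class, fused_scores = fuse(probs, MEASURE)
```

Unlike a weighted average, the fuzzy measure assigns a weight to every coalition of classifiers, so agreement between particular models can count for more or less than the sum of their individual weights. When all classifiers give the same score, the integral returns that score unchanged.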