Abstract

Language Identification (LID) is one of the most active areas of research in speech signal processing. Many approaches have been used to improve the performance of LID systems, including Parallel Phone Recognition Language Modeling (PPRLM), Support Vector Machines (SVM), and Gaussian Mixture Models (GMM). State-of-the-art LID systems have used a range of feature vectors, such as LPCC, MFCC, SDC, and prosodic features. Although fusing prosodic features with MFCC features yields some improvement in LID performance, it is still not sufficient. In this paper, a baseline LID system for multilingual environments is developed using a GMM classifier, with MFCC combined with Shifted-Delta-Cepstral (SDC) features as the front-end feature vectors. In this work, we use the Arunachali Language Speech Database (ALS-DB), a multilingual and multichannel speech corpus recently collected from four local languages of Arunachal Pradesh, namely Adi, Apatani, Galo, and Nyishi, together with Hindi and English as secondary languages. Combining MFCC and SDC features improves the performance of the LID system over either feature set used alone. The minimum equal error rates (EER) for MFCC and SDC individually are 19.70% and 11.83%, respectively, while the minimum EER for the combined MFCC and SDC features is 6.40%. The combined features thus improve the performance of the LID system by approximately 15.00% and 6.00% over the baseline systems that use MFCC and SDC features individually, respectively.
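To illustrate the front end described above, the following is a minimal sketch of how SDC features can be computed from a cepstral sequence and appended to MFCC frames. It is not the paper's implementation: the abstract does not give the SDC configuration, so the shift `d`, block spacing `P`, block count `k`, the use of the first 7 coefficients, and the edge padding are all assumptions, and the function names `sdc` and `combine_mfcc_sdc` are hypothetical. Random numbers stand in for real MFCCs.

```python
import numpy as np


def sdc(cepstra, d=1, P=3, k=7):
    """Shifted delta cepstra for a (T, N) cepstral matrix.

    For frame t and block i, the delta is c(t + i*P + d) - c(t + i*P - d).
    The k blocks are stacked, giving a (T, N * k) matrix.
    Parameter values are assumptions, not taken from the paper.
    """
    T, _ = cepstra.shape
    # Repeat the edge frames so every shifted index stays in range.
    padded = np.pad(cepstra, ((d, (k - 1) * P + d), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        start_plus = 2 * d + i * P   # padded index of c(t + i*P + d) at t = 0
        start_minus = i * P          # padded index of c(t + i*P - d) at t = 0
        blocks.append(padded[start_plus:start_plus + T]
                      - padded[start_minus:start_minus + T])
    return np.hstack(blocks)


def combine_mfcc_sdc(mfcc, n_sdc_coeffs=7, **sdc_params):
    """Append SDC features (from the first n_sdc_coeffs cepstra) to each MFCC frame."""
    return np.hstack([mfcc, sdc(mfcc[:, :n_sdc_coeffs], **sdc_params)])


# Stand-in data: 100 frames of 13 coefficients (assumed sizes).
rng = np.random.default_rng(0)
fake_mfcc = rng.standard_normal((100, 13))
features = combine_mfcc_sdc(fake_mfcc)  # shape (100, 13 + 7 * 7)
```

In a GMM-based system of the kind described, one model per language would then be trained on such combined frames, and a test utterance assigned to the language whose model gives the highest likelihood.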
