Abstract

In this paper, we present the details of a phonotactic language identification (LID) system developed for five Indian languages, English (Indian), Hindi, Malayalam, Tamil and Kan-nada. Since there are no publicly available speech databases for English, Malayalam and Kannada, we developed the database for each of the target languages by downloading the audio files from YouTube videos and removing the non-speech signals manually. The system was tested using a test data set consisting of 40 utterances with duration of 30, 10, and 3 sees, in each of 5 target languages. The performance evaluation was done separately accordingly to the NIST benchmarking sessions, for 30s, 10s and 3s segments separately. For the baseline system, we got an overall EER of 10.41 %, 19.56 % and 31.45 % for 30, 10, and 3 sees segments when tested with a 3-gram language model. The use of 4-gram language model has helped enhance the performance of the LID system to 9.81 %, 19.38 % and 32.77% respectively for 30,10 and 3 sees test segments. Further, by using the n-gram smoothing, we were able to improve the EER of the LID system, 9.02 %, 18.70 % and 29.24 % for 3-gram language models and 8.88 %, 16.46 % and 32.03 % for 4-gram language models, respectively for 30,10, and 3 sec test segments. The study shows that the use of 4-gram language models can help enhance the performance of LID systems for Indian languages.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.