Abstract

Human speech carries information about the speaker, the language and the content. The language of an utterance can readily be identified by extracting language-specific information from it, a task known as Language Identification (LID). LID is useful for speech translation, speech recognition and speech-activated automatic systems. It may also play an important role in speaker recognition, since identifying the language first reduces the search space. This paper proposes an approach to language identification based on Linear Predictive Coding (LPC) and Mel Frequency Cepstral Coefficient (MFCC) features, using Support Vector Machine (SVM) and Random Forest (RF) classifiers. Both LPC and MFCC are vocal tract features, and when extracted from speech they contain language-related as well as speaker-related information. Language identification depends heavily on extracting language-specific features. These vocal tract parameters carry more information about the spoken language than other parameters such as excitation source and prosodic parameters; hence the combination of the two features performs better than either one alone. Experiments were performed on a database obtained from IIIT-Hyderabad consisting of 5000 clean multilingual speech signals (Hindi, Bengali, Telugu, Tamil, Marathi and Malayalam). To train the proposed model, 600 speech signals were selected arbitrarily from this database, and a language model was created for each language. The proposed models were evaluated on a further 300 speech signals from the same database, using the individual features as well as the combined features. Experiments using both features together give better results than those using either feature alone.
Other researchers using these features have so far reported language identification accuracy of no more than 80%. The proposed approach improves the accuracy to 92.6% using a combination of the same features and a random forest model.
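
As an illustration of the feature side of this approach, the sketch below computes LPC coefficients for a single frame with the autocorrelation method and the Levinson-Durbin recursion, then concatenates them with an MFCC vector to form a combined feature vector. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the LPC order, frame length or windowing, and the function names `autocorrelation`, `lpc` and `fuse_features` are hypothetical. The MFCC vector is assumed to come from a separate front end, and the fused vector would then be passed to a classifier such as a random forest.

```python
def autocorrelation(frame, max_lag):
    """Return the autocorrelation values r[0..max_lag] of one speech frame."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def lpc(frame, order):
    """Estimate LPC coefficients with the Levinson-Durbin recursion.

    Returns [1, a1, ..., a_order], the coefficients of the prediction-error
    filter A(z) = 1 + a1*z^-1 + ... + a_order*z^-order.
    The order is an assumption; the abstract does not state one.
    """
    r = autocorrelation(frame, order)
    a = [1.0] + [0.0] * order
    err = r[0]
    if err == 0.0:
        return a  # silent frame: no prediction possible
    for i in range(1, order + 1):
        # Reflection coefficient for stage i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
        if err <= 0.0:
            break  # numerically degenerate frame: stop early
    return a


def fuse_features(lpc_coeffs, mfcc_coeffs):
    """Concatenate LPC coefficients (without the leading 1) and MFCCs."""
    return list(lpc_coeffs[1:]) + list(mfcc_coeffs)


# Toy frame: a decaying exponential, which a first-order predictor fits well.
frame = [0.9 ** n for n in range(200)]
coeffs = lpc(frame, order=2)

# Placeholder MFCC values, for illustration only.
combined = fuse_features(coeffs, [0.1, 0.2])
```

In practice the per-frame vectors would be aggregated over each utterance before training; how that is done in the paper is not stated in the abstract.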

