Abstract

Automatic language identification is important for a multilingual country like India. This paper describes a language identification (LID) system for three Indian languages, Hindi, Telugu, and Urdu, using Mel-frequency cepstral coefficient (MFCC) features. A universal background model (UBM) is built from MFCC features taken from all of the languages, which makes it language- and speaker-independent. Two cases are considered: a gender-independent Gaussian mixture model-universal background model (GMM-UBM) system and a gender-dependent one. The gender-dependent LID system achieves an accuracy of 83.3%, higher than that of the gender-independent system.
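To make the GMM-UBM idea concrete, the following is a minimal sketch of one common way such a system can score an utterance: each language model's average frame log-likelihood is compared against that of the UBM, and the language with the largest ratio is chosen. The abstract does not state the scoring rule, the number of mixture components, or how the language models are trained, so all of these are assumptions here. The function names (`gmm_loglik`, `identify`), the 2-dimensional toy "MFCC" frames, and every model parameter are hypothetical and chosen only for illustration.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at one feature frame x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood under a GMM.

    The GMM is a list of (weight, mean, variance) tuples (hypothetical layout).
    """
    total = 0.0
    for x in frames:
        comps = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        top = max(comps)
        # Log-sum-exp over components, shifted by the maximum for stability.
        total += top + math.log(sum(math.exp(c - top) for c in comps))
    return total / len(frames)

def identify(frames, language_models, ubm):
    """Return the language with the highest log-likelihood ratio against the UBM."""
    base = gmm_loglik(frames, ubm)
    scores = {
        lang: gmm_loglik(frames, gmm) - base
        for lang, gmm in language_models.items()
    }
    return max(scores, key=scores.get), scores

# Toy models: all weights, means, and variances below are made up.
ubm = [(0.5, [0.0, 0.0], [4.0, 4.0]), (0.5, [3.0, 3.0], [4.0, 4.0])]
language_models = {
    "Hindi": [(1.0, [0.0, 0.0], [1.0, 1.0])],
    "Telugu": [(1.0, [3.0, 3.0], [1.0, 1.0])],
    "Urdu": [(1.0, [-3.0, 3.0], [1.0, 1.0])],
}

# Three toy frames that lie close to the hypothetical "Telugu" model.
frames = [[2.8, 3.1], [3.2, 2.9], [3.0, 3.3]]
best, scores = identify(frames, language_models, ubm)
print(best)
```

In a real system the frames would be MFCC vectors extracted from speech, the UBM would be trained on pooled data from all languages, and a gender-dependent variant would keep separate models per gender. None of those steps is shown in this sketch.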
