Abstract

In this paper, two automatic language identification (LID) systems are compared. One is the Hidden Markov Model (HMM) based phonetic engine (PE), and the other is the Gaussian Mixture Model-based Universal Background Model (GMM–UBM) classifier. The PE belongs to the category of explicit LID systems, while the GMM–UBM classifier falls into the category of implicit LID systems. Ideally, explicit LID requires a segmented and phonetically labelled speech corpus, whereas implicit LID systems do not require any phonetic labelling of the data. Both systems are tested here on speech data from three Indian languages: Manipuri, Assamese and Bengali. These languages were selected because they are widely used in North Eastern India and because no identification task has so far been reported for a database containing all three together. The purpose of this comparison is to evaluate the LID performance of the relatively new PE approach against the prevalent GMM–UBM technique. In the experiments, the identification rate (IDR) is found to be higher with the PE than with the GMM–UBM system. On the same speech corpus, the average IDR is 99% with the PE and 96.94% with the GMM–UBM system. However, data preparation is more cumbersome and expensive for the PE than for the GMM–UBM system. Thus, the higher accuracy of the PE comes at the cost of more demanding data preparation.
