Abstract

This paper proposes an automatic tonal/non-tonal language classification system for North East (NE) Indian languages using formants and prosodic features. State-of-the-art systems for tonal/non-tonal classification rely mostly on prosodic features and use the utterance as the analysis unit during feature extraction. The present work therefore explores formants and studies whether they carry information complementary to prosody. It also compares different analysis units for feature extraction, namely syllable, di-syllable, word, and utterance. Classification techniques based on the Gaussian mixture model-universal background model (GMM-UBM), neural networks, and i-vectors are explored. The paper presents the NIT Silchar language database (NITS-LD), prepared in-house for experimental validation. It covers seven NE Indian languages and uses data from All India Radio broadcast news archives. Experimental analysis suggests that an artificial neural network (ANN) based on syllable-level features provides the lowest equal error rates (EERs) of 31.8, 36 and 37.8% for test data of durations 30, 10, and 3 s, respectively, when the combination of prosodic features and formants is used. The addition of formants improves system performance by up to 6.8, 7.8 and 9.2% for the three test durations, relative to using prosodic features alone.
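
Since the results above are reported as EERs, a minimal sketch of how an equal error rate can be computed from classifier scores may help. It is an illustration only, not the authors' evaluation code: the function name `equal_error_rate`, the choice of "tonal" as the target class, and the toy scores are all assumptions. The sketch sweeps the observed scores as candidate thresholds and reports the point where the false acceptance and false rejection rates are closest.

```python
import numpy as np


def equal_error_rate(target_scores, nontarget_scores):
    """Return (eer, threshold) for two sets of detection scores.

    Assumption: higher scores mean "more likely target" (here, tonal).
    The EER is approximated at the observed score where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest.
    """
    target = np.asarray(target_scores, dtype=float)
    nontarget = np.asarray(nontarget_scores, dtype=float)

    # Candidate thresholds: every distinct score that was observed.
    thresholds = np.unique(np.concatenate([target, nontarget]))

    best_gap, eer, best_threshold = np.inf, None, None
    for thr in thresholds:
        frr = np.mean(target < thr)       # targets wrongly rejected
        far = np.mean(nontarget >= thr)   # non-targets wrongly accepted
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2.0
            best_threshold = thr
    return float(eer), float(best_threshold)


# Toy scores (hypothetical, not taken from the paper).
tonal_scores = [0.9, 0.8, 0.7, 0.4]
non_tonal_scores = [0.6, 0.3, 0.2, 0.1]
eer, thr = equal_error_rate(tonal_scores, non_tonal_scores)
print(f"EER = {eer:.2%} at threshold {thr}")
```

With real systems the scores would come from the GMM-UBM, ANN, or i-vector back end, and a finer interpolation between thresholds is often used when the two error curves do not cross exactly at an observed score.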
