Abstract

Ao is a Tibeto-Burman language spoken in Nagaland in North-East India. It is a low-resource tone language with three tones: high, mid and low. The language has three distinct dialects: Chungli, Mongsen and Changki. This paper presents an automatic dialect identification system for Ao that uses excitation source features. The objective of a dialect identification system is to identify a speech variety within a language. The goal of this study is to determine whether excitation source features such as Residual Mel Frequency Cepstral Coefficients (RMFCC) can be exploited to discriminate the three dialects of Ao automatically. Vocal tract system features, namely Mel Frequency Cepstral Coefficients (MFCC) and Shifted Delta Cepstral (SDC) coefficients, are used as baselines. The RMFCC features are obtained from the Linear Prediction (LP) residual signal, while the MFCC features are derived from the smooth spectrum of the speech signal. SDC coefficients are explored to provide additional temporal information. The system is evaluated on trisyllabic words uttered by 36 speakers for the three dialects of Ao, using a Gaussian Mixture Model (GMM) based classifier. Dialect identification accuracy improves when all three features are combined.
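
The LP residual from which the RMFCC features are derived can be illustrated with a minimal sketch. It assumes the standard autocorrelation method of linear prediction on a single frame. The function name `lp_residual`, the default order of 10 and the small regularisation term are illustrative assumptions, not settings taken from the paper. Computing RMFCC would then mean applying an ordinary MFCC pipeline to the residual rather than to the speech signal, which is not shown here.

```python
import numpy as np

def lp_residual(frame, order=10):
    """Return (residual, lp_coeffs) for one frame.

    Hypothetical helper: autocorrelation-method linear prediction.
    The order of 10 is an assumed default, not a value from the paper.
    """
    frame = np.asarray(frame, dtype=float)
    n = len(frame)

    # Autocorrelation at lags 0..order.
    r = np.array([np.dot(frame[: n - k], frame[k:]) for k in range(order + 1)])

    # Toeplitz normal equations R a = r[1:], with a tiny ridge for stability.
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)
    a = np.linalg.solve(R, r[1:])

    # Predict each sample from the previous `order` samples:
    # s_hat[m] = sum_k a[k] * s[m - k - 1], with zeros before the frame start.
    padded = np.concatenate([np.zeros(order), frame])
    pred = np.zeros(n)
    for k in range(order):
        start = order - k - 1
        pred += a[k] * padded[start : start + n]

    # The residual is the part of the signal the LP model cannot predict,
    # which mainly reflects the excitation source.
    return frame - pred, a
```

The residual has the same length as the input frame. For voiced speech it carries much less energy than the original signal, because the vocal tract resonances captured by the LP coefficients have been removed.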
