This article presents a method for automatically recognizing spoken digits in the Adi language using Kaldi, an open-source speech recognition toolkit. The study also analyzes the formant frequencies and spectral characteristics of Adi digits. Adi is a zero-resource indigenous tribal language of Arunachal Pradesh belonging to the Tibeto-Burman family. UNESCO's 2009 Atlas of the World's Languages in Danger classifies Adi as an endangered language in India. The research uses a modest digit corpus collected from 42 native Adi speakers and extracts Mel Frequency Cepstral Coefficient (MFCC) and Perceptual Linear Prediction (PLP) features from continuous Adi speech utterances. The study explores both monophone and triphone models for digit recognition: accuracy rises from 79.14% with the monophone model to 84.29%, 88.43%, and 90.86% with the triphone models Tri1, Tri2, and Tri3, respectively. The proposed model holds promise for the development of Adi language speech recognition applications.
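To put the reported accuracies in perspective, the short sketch below computes each triphone stage's absolute gain over the monophone baseline and its relative error reduction. It assumes the error rate is simply 100 minus the reported accuracy, which the abstract does not state explicitly; the function names are illustrative only and are not part of Kaldi.

```python
# Accuracies (%) as reported in the abstract, keyed by training stage.
ACCURACY = {"mono": 79.14, "tri1": 84.29, "tri2": 88.43, "tri3": 90.86}


def error_rate(accuracy):
    """Error rate in percent, assumed here to be 100 minus accuracy."""
    return 100.0 - accuracy


def relative_error_reduction(baseline_acc, new_acc):
    """Fraction of the baseline error removed by the newer model."""
    baseline_err = error_rate(baseline_acc)
    return (baseline_err - error_rate(new_acc)) / baseline_err


def summarize(acc_table, baseline="mono"):
    """Map each non-baseline stage to (absolute gain in points, relative error reduction in %)."""
    base_acc = acc_table[baseline]
    summary = {}
    for stage, acc in acc_table.items():
        if stage == baseline:
            continue
        gain = round(acc - base_acc, 2)
        reduction = round(100.0 * relative_error_reduction(base_acc, acc), 1)
        summary[stage] = (gain, reduction)
    return summary


if __name__ == "__main__":
    for stage, (gain, reduction) in summarize(ACCURACY).items():
        print(f"{stage}: +{gain} points, {reduction}% relative error reduction")
```

Under that assumption, each successive triphone stage removes a larger share of the monophone model's errors, which is a more informative comparison than the raw accuracy figures alone.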