Multi-class identification of tonal contrasts in Chokri using supervised machine learning algorithms

Amalesh Gope,Anusuya Pal,Sekholu Tetseo,Tulika Gogoi,Joanna J,Dinkur Borah

doi:10.1057/s41599-024-03113-2

Abstract

This study examines and explores the effectiveness of various Machine Learning Algorithms (MLAs) in identifying intricate tonal contrasts in Chokri (ISO 639-3), an under-documented and endangered Tibeto-Burman language of the Sino-Tibetan language family spoken in Nagaland, India. Seven different supervised MLAs, viz., [Logistic Regression (LR), Decision Tree (DT), Random Forest (RF), Support Vector Machine (SVM), K-Nearest Neighbors (KNN), Naive Bayes (NB)], and one neural network (NN)-based algorithms [Artificial Neural Network (ANN)] are implemented to explore five-way tonal contrasts in Chokri. Acoustic correlates of tonal contrasts, encompassing fundamental frequency fluctuations, viz., f0 height and f0 direction, are examined. Contrary to the prevailing notion of NN supremacy, this study underscores the impressive accuracy achieved by the RF. Additionally, it reveals that combining f0 height and directionality enhances tonal contrast recognition for female speakers, while f0 directionality alone suffices for male speakers. The findings demonstrate MLAs’ potential to attain accuracy rates of 84–87% for females and 95–97% for males, showcasing their applicability in deciphering the intricate tonal systems of Chokri. The proposed methodology can be extended to predict multi-class problems in diverse fields such as image processing, speech classification, medical diagnosis, computer vision, and social network analysis.

Full Text