Part-of-speech (POS) tagging is a fundamental Natural Language Processing (NLP) task that labels each word in an input text with its grammatical category. Although POS tagging is well established for high-resource languages, it remains largely unexplored for low-resource languages such as Limbu, for which few tagged datasets and linguistic resources exist. This study uses deep learning, transfer learning, and a BiLSTM-CRF model to develop an accurate POS tagging system for the Limbu language. Using annotated and unannotated language data, we build a small yet informative dataset of Limbu text. Pretrained multilingual models were adapted to improve performance on low-resource language tasks. The proposed model attains 90% accuracy, a substantial improvement over traditional rule-based and machine learning methods for Limbu POS tagging. The results indicate that deep learning methods can address the linguistic challenges facing low-resource languages even with limited data. This study thus provides a foundation for follow-up NLP applications for Limbu and similar low-resource languages, and shows how deep learning can help fill the gap where data is scarce.