A Study on the Performance of Recurrent Neural Network based Models in Maithili Part of Speech Tagging

Ankur Priyadarshi, Sujan Kumar Saha

doi:10.1145/3540260

Abstract

This article presents our effort in developing a Maithili Part of Speech (POS) tagger. Substantial effort has been devoted to developing POS taggers for several Indian languages, including Hindi, Bengali, Tamil, Telugu, Kannada, Punjabi, and Marathi, but Maithili has not received much attention from the research community. Maithili is one of the official languages of India, with around 50 million native speakers. We therefore worked on developing a POS tagger for Maithili. For the development, we use a manually annotated in-house Maithili corpus containing 56,126 tokens. The tagset contains 27 tags. We train a conditional random fields (CRF) classifier to prepare a baseline system that achieves an accuracy of 82.67%. Then, we employ several recurrent neural network (RNN)-based models, including Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU), LSTM with a CRF layer (LSTM-CRF), and GRU with a CRF layer (GRU-CRF), and perform a comparative study. We also study the effect of both word embeddings and character embeddings on the task. The highest accuracy of the system is 91.53%.
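To make the role of the CRF layer in the LSTM-CRF and GRU-CRF models more concrete, the following is a minimal sketch of Viterbi decoding, the standard way a CRF layer picks the best tag sequence from per-token scores and tag-to-tag transition scores. It is not the authors' implementation. The function names, the two-tag toy tagset (`NOUN`, `VERB`), and all numeric scores are assumptions made for illustration; the abstract does not list the 27 tags or give any model parameters. The `token_accuracy` helper assumes that the reported accuracy is token-level, which is the usual convention for POS tagging but is not stated in the abstract.

```python
from typing import Dict, List, Sequence


def viterbi_decode(
    emissions: Sequence[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
    tags: Sequence[str],
) -> List[str]:
    """Return the highest-scoring tag sequence for one sentence.

    emissions[i][t]   : score of tag t at position i (e.g. from an LSTM or GRU).
    transitions[p][t] : score of moving from tag p to tag t (the CRF part).
    """
    if not emissions:
        return []
    # best[t] = best score of any path that ends in tag t at the current position
    best = {t: emissions[0][t] for t in tags}
    backpointers: List[Dict[str, str]] = []
    for scores in emissions[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for t in tags:
            # Pick the previous tag that gives the best path into tag t.
            prev = max(tags, key=lambda p: best[p] + transitions[p][t])
            new_best[t] = best[prev] + transitions[prev][t] + scores[t]
            pointers[t] = prev
        best = new_best
        backpointers.append(pointers)
    # Start from the best final tag and follow the back-pointers to the start.
    last = max(tags, key=lambda t: best[t])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


def token_accuracy(
    gold: Sequence[Sequence[str]], pred: Sequence[Sequence[str]]
) -> float:
    """Fraction of tokens whose predicted tag equals the gold tag."""
    correct = sum(g == p for gs, ps in zip(gold, pred) for g, p in zip(gs, ps))
    total = sum(len(gs) for gs in gold)
    return correct / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical toy tagset and scores, not taken from the paper.
    toy_tags = ["NOUN", "VERB"]
    toy_emissions = [
        {"NOUN": 2.0, "VERB": 1.0},
        {"NOUN": 1.0, "VERB": 1.2},
    ]
    toy_transitions = {
        "NOUN": {"NOUN": -1.0, "VERB": 1.0},
        "VERB": {"NOUN": 0.0, "VERB": -1.0},
    }
    print(viterbi_decode(toy_emissions, toy_transitions, toy_tags))
```

In a plain LSTM or GRU tagger, each token's tag is chosen independently from its emission scores. Adding the transition term lets the model penalize unlikely tag bigrams, which is the usual motivation for placing a CRF layer on top of a recurrent encoder.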

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication Date: Feb 21, 2023
Citations: 2

Similar Papers

Design and Develop A Part of Speech Tagging for Ge’ez Language using Deep Learning Approach
Asnak Yihunie Kassahun ... Tessfu Geteye Fantaye
28 Nov 2022

A Deep Learning Approach to Malayalam Parts of Speech Tagging
M K Junaida ... Anto P Babu
01 Jan 2020

The first named entity recognizer in Maithili: Resource creation and system development
Ankur Priyadarshi ... Sujan Kumar Saha
Journal of Intelligent & Fuzzy Systems | VOL. 41
11 Aug 2021

Deep Learning based Part-of-Speech tagging for Assamese using RNN and GRU
Kuwali Talukdar ... Shikhar Kumar Sarma
Procedia Computer Science | VOL. 235
01 Jan 2024
