Improving NER Tagging Performance in Low-Resource Languages via Multilingual Learning

Rudra Murthy, Mitesh M Khapra, Pushpak Bhattacharyya

doi:10.1145/3238797

Abstract

Existing supervised solutions for Named Entity Recognition (NER) typically rely on a large annotated corpus. Collecting a large NER-annotated corpus is time-consuming and requires considerable human effort. Collecting a small annotated corpus is feasible for any language, but performance then degrades due to data sparsity. We address this sparsity by borrowing features from the data of a closely related language. We use hierarchical neural networks to train a supervised NER system, and the feature borrowing happens via the shared layers of the network. The network is trained on the combined dataset of the low-resource language and a closely related language, a setup termed multilingual learning. Unlike existing systems, we share all layers of the network between the two languages. We apply multilingual learning to NER in Indian languages and empirically show its benefits over a monolingual deep learning system and a traditional machine-learning system with some feature engineering. We show that, with multilingual learning, NER performance in the low-resource language improves mainly due to (1) an increased named entity vocabulary, (2) cross-lingual subword features, and (3) multilingual learning acting as a regularizer.
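As a rough illustration of effects (1) and (2), the toy sketch below merges a tiny low-resource corpus with one from a related language and compares entity vocabulary size and character n-gram coverage for an unseen name. It is not the paper's hierarchical network: the romanized tokens, the trigram choice, and all function names (`char_ngrams`, `subword_vocab`, `entity_vocab`, `coverage`) are assumptions made purely for illustration.

```python
from typing import List, Set, Tuple

Sentence = List[Tuple[str, str]]  # (token, tag) pairs; "O" marks non-entities


def char_ngrams(token: str, n: int = 3) -> Set[str]:
    """Character n-grams of a token, with < and > as boundary markers."""
    padded = f"<{token}>"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def subword_vocab(corpus: List[Sentence], n: int = 3) -> Set[str]:
    """All character n-grams seen anywhere in the corpus."""
    vocab: Set[str] = set()
    for sentence in corpus:
        for token, _tag in sentence:
            vocab |= char_ngrams(token, n)
    return vocab


def entity_vocab(corpus: List[Sentence]) -> Set[str]:
    """Tokens that carry a named-entity tag (anything other than "O")."""
    return {tok for sentence in corpus for tok, tag in sentence if tag != "O"}


def coverage(token: str, vocab: Set[str], n: int = 3) -> float:
    """Fraction of the token's n-grams already present in the vocabulary."""
    grams = char_ngrams(token, n)
    return len(grams & vocab) / len(grams)


# Hypothetical romanized toy data, not taken from the paper.
low_resource: List[Sentence] = [[("ram", "B-PER"), ("ghar", "O")]]
related: List[Sentence] = [[("ramesh", "B-PER"), ("dilli", "B-LOC")]]
combined = low_resource + related  # multilingual training set

unseen_name = "rameshwar"
cov_mono = coverage(unseen_name, subword_vocab(low_resource))
cov_multi = coverage(unseen_name, subword_vocab(combined))

print(len(entity_vocab(low_resource)), len(entity_vocab(combined)))
print(cov_mono < cov_multi)
```

In this toy setting the combined corpus both enlarges the set of known entity tokens and covers more of the unseen name's trigrams, which is the kind of signal a shared subword layer could exploit. The regularization effect (3) is not modeled here.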

Journal: ACM Transactions on Asian and Low-Resource Language Information Processing
Publication Date: Dec 14, 2018
Citations: 14

Similar Papers

Chinese Named Entity Recognition Based on B-LSTM Neural Network with Additional Features
Liubo Ouyang ... Yuan Tian
01 Jan 2017

Representing raw linguistic information in chinese text-to-speech system
Minghui Dong ... Zhengchen Zhang
01 Dec 2017

Named Entity Recognition for a Low Resource Language
Abhijit Debbarma* ... Dr Paritosh Bhattacharya
International Journal of Recent Technology and Engineering (IJRTE) | VOL. 8
30 Sep 2019

Wasserstein Cross-Lingual Alignment For Named Entity Recognition
Rui Wang ... Ricardo Henao
23 May 2022
