Effective Bilingual Constraints for Semi-Supervised Learning of Named Entity Recognizers

Mengqiu Wang, Wanxiang Che, Christopher Manning

doi:10.1609/aaai.v27i1.8617

Abstract

Most semi-supervised methods in Natural Language Processing capitalize on unannotated resources in a single language; however, information can be gained from using parallel resources in more than one language, since translations of the same utterance in different languages can help to disambiguate each other. We demonstrate a method that makes effective use of vast amounts of bilingual text (a.k.a. bitext) to improve monolingual systems. We propose a factored probabilistic sequence model that encourages both cross-language and intra-document consistency. A simple Gibbs sampling algorithm is introduced for performing approximate inference. Experiments on English-Chinese Named Entity Recognition (NER) using the OntoNotes dataset demonstrate that our method is significantly more accurate than state-of-the-art monolingual CRF models in a bilingual test setting. Our model also improves on previous work by Burkett et al. (2010), achieving a relative error reduction of 10.8% and 4.5% in Chinese and English, respectively. Furthermore, by annotating a moderate amount of unlabeled bitext with our bilingual model, and using the tagged data for uptraining, we achieve a 9.2% error reduction in Chinese over the state-of-the-art Stanford monolingual NER system.
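
The relative error reductions quoted above can be read with a small arithmetic sketch. The abstract does not say which score the reductions are computed from, so the sketch assumes a percentage score such as F1, with error defined as 100 minus the score. The function name and the example numbers are hypothetical and are not taken from the paper.

```python
def relative_error_reduction(baseline_score: float, new_score: float,
                             max_score: float = 100.0) -> float:
    """Fraction of the baseline's error that the new system removes.

    Assumption (not stated in the abstract): error = max_score - score,
    where the score is a percentage such as F1.
    """
    baseline_error = max_score - baseline_score
    new_error = max_score - new_score
    if baseline_error <= 0:
        raise ValueError("baseline has no error left to reduce")
    return (baseline_error - new_error) / baseline_error


# Hypothetical scores for illustration only (not results from the paper):
# a baseline at 80.0 and a new system at 82.0.
reduction = relative_error_reduction(80.0, 82.0)
print(f"{reduction * 100:.1f}% relative error reduction")
```

Under this reading, a gain of a couple of absolute points on a strong baseline can correspond to a relative error reduction of around ten percent, which is why the two kinds of figures should not be compared directly.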

Journal: Proceedings of the AAAI Conference on Artificial Intelligence
Publication Date: Jun 30, 2013
Citations: 44

Similar Papers

Development of a Mandarin-English Bilingual Speech Recognition System for Real World Music Retrieval
Q Zhang ... J Pan
IEICE Transactions on Information and Systems | VOL. E91-D
01 Mar 2008

Slips of the Tongue
Nanda Poulisse
15 Nov 1999

Automatically Inducing a Part-of-Speech Tagger by Projecting from Multiple Source Languages Across Aligned Corpora
Victoria Fossum ... Steven Abney
01 Jan 2004

Mandarin-English bilingual Speech Recognition for real world music retrieval
Qingqing Zhang ... Yonghong Yan
01 Mar 2008
