Two-Phase Base Noun Phrase Alignment in Chinese-English Parallel Corpora

Jun Zhao, Feifan Liu, Dongming Liu

doi:10.1109/nlpke.2005.1598762

Abstract

This paper proposes a two-phase approach to automatically aligning bilingual base noun phrases in a sentence-aligned Chinese-English parallel corpus. Alignment is conducted in two phases: the first handles high-frequency base noun phrases using statistical co-occurrence information from the parallel corpus, and the second handles low-frequency base noun phrases using bilingual lexical information and the Dice coefficient similarity metric. This division can reasonably be expected to achieve higher recall without degrading overall precision. Furthermore, the approach avoids complex Chinese parsing problems and does not need to recognize Chinese base noun phrases accurately before alignment. It also relieves, to some extent, the serious impact of error propagation that may result from unstable and impractical Chinese base noun phrase extraction tools. In addition, handling high-frequency noun phrases with statistical information makes it possible to recognize some non-compositional phrase pairs, which are difficult for purely syntax-based or lexicon-based systems to handle.
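The abstract does not give the exact scoring formula, so the following is only a minimal sketch of how a Dice coefficient could be combined with a bilingual lexicon to score a low-frequency candidate pair. The function names (`dice`, `lexical_dice`), the toy lexicon, and the rule for handling untranslated words are all assumptions made for illustration, not the authors' implementation.

```python
def dice(a, b):
    """Standard Dice coefficient between two collections: 2|A∩B| / (|A| + |B|)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return 2 * len(a & b) / (len(a) + len(b))


def lexical_dice(zh_words, en_words, lexicon):
    """Score a Chinese word sequence against an English base noun phrase.

    Assumption: each Chinese word is mapped to one of its lexicon
    translations that occurs in the English phrase; a word with no such
    translation is kept as-is so that it lowers the score.
    """
    en_set = {w.lower() for w in en_words}
    translated = set()
    for zh in zh_words:
        hits = lexicon.get(zh, set()) & en_set
        # Pick deterministically when several translations match.
        translated.add(min(hits) if hits else zh)
    return dice(translated, en_set)


# Hypothetical toy lexicon, for illustration only.
lexicon = {
    "自然": {"natural", "nature"},
    "语言": {"language"},
    "处理": {"processing", "handle"},
}

good = lexical_dice(["自然", "语言", "处理"], ["natural", "language", "processing"], lexicon)
poor = lexical_dice(["自然", "语言", "处理"], ["language", "model"], lexicon)
print(good, poor)
```

In this toy setup the fully translatable pair scores higher than the partially matching one, which is the behaviour a threshold-based aligner would rely on; how the paper actually sets thresholds or resolves competing candidates is not stated in the abstract.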

Similar Papers

Chinese Base Noun Phrase Based on Multi-Class Support Vector Machines and Rules of Post-Processing
Runsheng Gan ... Meihua Wang
01 Nov 2010

Identification of Noun Phrase with Various Granularities
Ying Qin ... Yixin Zhong
01 Aug 2007

Chinese maximal noun phrase parsing based on cascaded conditional random fields
Dongfeng Cai ... Na Ye
01 Sep 2009

Annotation of complex noun phrases from multilingual parallel corpus
Jingxiang Cao ... Degen Huang
01 Oct 2012
