Abstract

In this article we present an automatic approach to extracting Hindi-English (H-E) Named Entity (NE) translingual equivalences from bilingual parallel corpora. In the absence of a Hindi NE tagger or H-E translation dictionary, this approach adapts a Chinese-English (C-E) surface string transliteration model for H-E NE extraction. The model is initially trained using automatically extracted C-E NE pairs, then iteratively updated based on newly extracted H-E NE pairs. For each English person and location NE in each sentence pair, this approach searches for its Hindi correspondence with minimum transliteration cost and constructs an H-E NE list from the bilingual corpus. Experiments show that this approach extracted 1000 H-E NE pairs with a precision of 91.8%.
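The search step described above (for each English NE, pick the Hindi span with minimum transliteration cost) can be illustrated with a minimal sketch. This is not the authors' model: it swaps in a plain weighted edit distance for the learned surface string transliteration model, and it assumes the Hindi side has already been romanized into Latin-script tokens. The names `transliteration_cost`, `best_match`, `sub_cost`, and `max_len` are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Tuple


def transliteration_cost(
    src: str,
    tgt: str,
    sub_cost: Optional[Dict[Tuple[str, str], float]] = None,
) -> float:
    """Weighted edit distance between two strings, normalized by the longer length.

    Assumption: sub_cost stands in for a learned character mapping table;
    any character pair not listed costs 1.0, as do insertions and deletions.
    """
    sub_cost = sub_cost or {}
    m, n = len(src), len(tgt)
    dp = [[0.0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        dp[i][0] = float(i)
    for j in range(1, n + 1):
        dp[0][j] = float(j)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            a, b = src[i - 1], tgt[j - 1]
            s = 0.0 if a == b else sub_cost.get((a, b), 1.0)
            dp[i][j] = min(
                dp[i - 1][j] + 1.0,    # delete a character of src
                dp[i][j - 1] + 1.0,    # insert a character of tgt
                dp[i - 1][j - 1] + s,  # substitute (or match at zero cost)
            )
    return dp[m][n] / max(m, n, 1)


def best_match(
    english_ne: str, hindi_tokens: List[str], max_len: int = 3
) -> Tuple[str, float]:
    """Return the romanized Hindi token span with the lowest cost and that cost."""
    target = english_ne.lower().replace(" ", "")
    best: Tuple[str, float] = ("", float("inf"))
    for start in range(len(hindi_tokens)):
        for length in range(1, max_len + 1):
            if start + length > len(hindi_tokens):
                break
            span = hindi_tokens[start:start + length]
            cost = transliteration_cost(target, "".join(span).lower())
            if cost < best[1]:
                best = (" ".join(span), cost)
    return best


# Hypothetical romanized Hindi sentence tokens, for illustration only.
match, cost = best_match("Delhi", ["mantri", "dilli", "gaye"])
```

In the approach the abstract describes, the costs come from a model first trained on C-E NE pairs and then iteratively updated with newly extracted H-E pairs; in this sketch that would correspond to re-estimating `sub_cost` after each extraction round.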
