Abstract
Proper nouns in metadata are representative features for linking identical records across data sources in different languages. To improve the accuracy of proper noun recognition, we propose a back-transliteration method in which transliterated words in the target language are back-transliterated to their original words in the source language. The acquired words and their transliterations are then used to recognize and transliterate proper nouns in metadata. Experimental results show that using the acquired bilingual word pairs can improve the accuracy of cross-language record linkage.