Abstract

This paper addresses the question of how far translations in bilingual parallel corpora match dictionary senses. Automatic matching of corpus translations with dictionary senses depends on four factors: the quality of the lexicographic knowledge used, the quality of the corpus processing, the statistics used to filter relevant entries from the corpora, and the quality of the translations in the multilingual corpora. We focus on the influence of the last of these factors on the performance of the automatic matching. As in previous approaches, we relied on Machine Readable Dictionaries (MRDs), a part-of-speech tagger, and bilingual aligned corpora. In addition, we used a shallow sentence parser for syntactic matching. We conducted two case studies on two corpora from different domains. Our test set was the intersection of 500 French communication verbs within the corpora. The results confirm that the performance of the automatic matching varies considerably with the translation quality of the parallel texts.
