Abstract

Alignment is an important step in exploiting parallel corpora linguistically. In this paper we introduce a morphological component that improves the alignment of German-English parallel texts and helps find correspondences between morphological elements at the sub-word level. The paper deals with one small aspect of an alignment system, namely the improvement of a dictionary-based distance measure through a morphological analyser.

What is alignment?

For the purposes of this paper we define a bilingual parallel text as a text (L1) and its translation (L2). A sentence-level alignment then maps groups of L1-sentences to corresponding groups of L2-sentences. These groups are often called beads. An alignment can be viewed as a sequence of beads that covers the entire parallel text. While most beads express the correspondence between a single L1-sentence and a single L2-sentence, other types of beads arise when the translator splits, merges, deletes, adds or reorders sentences. Each sentence belongs to exactly one bead.

To illustrate some of the difficulties, consider the following excerpt from the very beginning of ‘The War of the Worlds’ parallel text:
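
Returning to the definition of an alignment as a sequence of beads, the constraint that each sentence belongs to exactly one bead can be made concrete with a minimal sketch. The names `Bead`, `bead_type` and `is_valid_alignment`, the representation of sentences as integer indices, and the sample alignment are all assumptions made for illustration; they are not taken from the system described in the paper.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Bead:
    """Hypothetical bead: a group of L1 sentence indices paired with
    a group of L2 sentence indices (either group may be empty)."""
    l1: Tuple[int, ...]
    l2: Tuple[int, ...]


def bead_type(bead: Bead) -> str:
    """Describe a bead by its group sizes, e.g. '1:1' or '2:1'."""
    return f"{len(bead.l1)}:{len(bead.l2)}"


def is_valid_alignment(beads: List[Bead], n_l1: int, n_l2: int) -> bool:
    """Check that every sentence of both texts occurs in exactly one bead."""
    l1_seen = sorted(i for b in beads for i in b.l1)
    l2_seen = sorted(j for b in beads for j in b.l2)
    return l1_seen == list(range(n_l1)) and l2_seen == list(range(n_l2))


# Invented example: 5 L1 sentences aligned with 4 L2 sentences.
alignment = [
    Bead((0,), (0,)),     # one sentence translated as one sentence
    Bead((1,), (1, 2)),   # one sentence split into two
    Bead((2, 3), (3,)),   # two sentences merged into one
    Bead((4,), ()),       # sentence deleted by the translator
]

print([bead_type(b) for b in alignment])
print(is_valid_alignment(alignment, n_l1=5, n_l2=4))
```

Storing indices rather than sentence text keeps the coverage check independent of how sentences are tokenised or compared. Because the check sorts the indices before comparing, it also accepts beads whose sentences were reordered by the translator.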
