Abstract

Parallel text corpora supply researchers with data for multilingual lexicographic research, translation studies, and language typology. The objectives of the ParRus research project at the University of Tampere are to compile a Russian-Finnish parallel corpus and to develop software for maintaining it. Text alignment is the crucial problem in compiling parallel corpora. The study of parallel texts shows that, in most cases, the translator retains the paragraphs of the original in the translation. The Source Language – Target Language (SL-TL) quotient, that is, the ratio of the number of words in the originals to the number of words in the translations, is also a stable value. The alignment programme developed at the Department compares the original with the translation paragraph by paragraph, adding new paragraphs to the extracts being aligned until the extracts match the SL-TL quotient. The system produces good results only if the translation is structurally close to the original. However, the study of parallel texts also shows that the frequencies of words and of their translation equivalents usually do not match. Therefore, unless knowledge structures are exploited, paragraphs and larger text units are the only elements of formal text structure that can be used for comparing parallel texts.
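The paragraph-by-paragraph procedure described above can be illustrated with a minimal Python sketch. It is not the ParRus programme itself: the function names (`word_count`, `align_paragraphs`), the parameters `quotient`, `tolerance` and `max_group`, and the rule for deciding which side to extend are all assumptions made for illustration, since the abstract does not specify them.

```python
def word_count(paragraphs):
    """Total number of whitespace-separated words in a list of paragraphs."""
    return sum(len(p.split()) for p in paragraphs)


def align_paragraphs(source, target, quotient, tolerance=0.25, max_group=4):
    """Greedily pair source and target extracts by word-count ratio.

    quotient  -- expected SL-TL quotient (source words / target words);
                 assumed to be estimated beforehand from the corpus.
    tolerance -- hypothetical relative deviation from the quotient that
                 still counts as a match.
    max_group -- hypothetical cap on paragraphs per extract, so that one
                 bad match cannot swallow the rest of the text.
    """
    pairs = []
    i = j = 0
    while i < len(source) and j < len(target):
        src_end, tgt_end = i + 1, j + 1
        while True:
            src_words = word_count(source[i:src_end])
            tgt_words = word_count(target[j:tgt_end])
            ratio = src_words / max(tgt_words, 1)
            if abs(ratio - quotient) <= tolerance * quotient:
                break  # extracts match the quotient closely enough
            can_grow_tgt = tgt_end < len(target) and tgt_end - j < max_group
            can_grow_src = src_end < len(source) and src_end - i < max_group
            if ratio > quotient and can_grow_tgt:
                tgt_end += 1  # source extract too long: extend the target
            elif ratio < quotient and can_grow_src:
                src_end += 1  # target extract too long: extend the source
            else:
                break  # nothing left to add; accept the imperfect pair
        pairs.append((source[i:src_end], target[j:tgt_end]))
        i, j = src_end, tgt_end
    # Any paragraphs left over on one side form a final, unmatched pair.
    if i < len(source) or j < len(target):
        pairs.append((source[i:], target[j:]))
    return pairs
```

With a quotient of 1.0, a two-word source paragraph facing two one-word target paragraphs would be paired with both of them, because the target extract is extended until the word-count ratio falls within the tolerance. As the abstract notes, an approach of this kind can only be expected to work when the translation is structurally close to the original.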

