Multilingual term extraction from parallel corpora – A methodology for the automatic extraction of verbal structures and their translation equivalents

Tamás Váradi, Enikő Héja

doi:10.1556/materm.4.2011.2.10

Abstract

The aim of this paper is to confirm that the methodology used to extract one-token translation candidates from parallel corpora can be extended to retrieve multi-word verbal structures. From a terminological point of view, the technique is relevant because it supplies terminologists with empirical data and a large set of term candidates, thus facilitating their work. Verbal structures were retrieved from the parallel corpus semi-automatically: a broad range of automatically recognized verbal structures was manually narrowed down to a smaller set that is relevant from a translation point of view. In the next step, every occurrence of the selected verbal structures was merged into a one-token unit in the parallel corpus, so that the structures could serve as input to the alignment algorithm. The result is a core dictionary comprising multi-word verbal structures, their possible translations and the contexts in which they appear.
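The merging step can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `merge_structures`, the underscore joiner, and the example phrases are hypothetical, and the sketch assumes that each selected structure occurs as a contiguous token sequence, which need not hold for the verbal structures treated in the paper.

```python
def merge_structures(tokens, structures, joiner="_"):
    """Replace each contiguous occurrence of a selected structure with one token.

    tokens     -- list of word tokens for one sentence
    structures -- iterable of token sequences chosen manually (hypothetical input)
    joiner     -- string used to glue the parts together (an assumption)
    """
    # Try longer structures first so that overlapping candidates resolve predictably.
    ordered = sorted((tuple(s) for s in structures), key=len, reverse=True)
    merged = []
    i = 0
    while i < len(tokens):
        for struct in ordered:
            n = len(struct)
            if n > 0 and tuple(tokens[i:i + n]) == struct:
                # Emit the whole structure as a single one-token unit.
                merged.append(joiner.join(struct))
                i += n
                break
        else:
            # No structure starts at this position: keep the token unchanged.
            merged.append(tokens[i])
            i += 1
    return merged


selected = [("take", "part", "in"), ("take", "part")]
sentence = "they take part in the debate".split()
print(merge_structures(sentence, selected))
# prints ['they', 'take_part_in', 'the', 'debate']
```

Once both sides of the parallel corpus have been rewritten in this way, a word aligner that only handles one-token units can treat each merged structure as an ordinary token.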

Journal: Magyar Terminológia
Publication Date: Dec 1, 2011
Citations: 1

Similar Papers

Construction of an English-Chinese Parallel Corpus of WHO Health Translation
Meng Ji
01 Jan 2017

Indonesian-Japanese term extraction from bilingual corpora using machine learning
Muhammad Nassirudin ... Ayu Purwarianti
01 Oct 2015

Quantitative Distribution of Verbal Structures with Reference to the Authorship Factor in Legal Stylistics
Edyta Więcławska
Studies in Logic, Grammar and Rhetoric | VOL. 66
19 Nov 2021

Application of multilingual corpus in contrastive studies (on the example of the Bulgarian-Polish-Lithuanian parallel corpus)
Ludmila Dimitrova ... Violetta Koseska-Toszewa
Cognitive Studies | Études cognitives
24 Nov 2015
