Creation of a High-quality, Register-diversified Parallel (English-Spanish) Corpus for Linguistic and Computational Investigations

Julia Lavid, Jorge Arús, Bernard Declerck, Veronique Hoste

doi:10.1016/j.sbspro.2015.07.443

Abstract

This paper outlines ongoing work, carried out within the framework of the MULTINOT project, on the construction of a high-quality, richly annotated and register-diversified parallel corpus for the English-Spanish language pair. The corpus consists of original and translated texts in both directions and is designed as a multifunctional resource for use in a number of disciplines, such as corpus-based contrastive linguistics and translation studies, machine translation, computer-assisted translation, computer-assisted language learning and terminology extraction. The paper describes the structure of the corpus, which comprises four subcorpora: English originals (EO), Spanish originals (SO), English translations (Etrans) and Spanish translations (Strans). It also describes the registers selected for inclusion in the corpus and the methodology used to guarantee the quality of the processing steps that enrich the corpus with linguistic information at different levels.

Journal: Procedia - Social and Behavioral Sciences
Publication Date: Jul 1, 2015
Citations: 3
License type: cc-by-nc-nd
