Translation Alignment with Ugarit

Tariq Yousef, Maryam Foradi, Chiara Palladino, Farnoosh Shamsian

doi:10.3390/info13020065

Abstract

Ugarit is a public web-based tool for the manual annotation of parallel texts, used to generate word-level translation alignments. We aimed to develop a user-friendly interactive interface to visualize aligned texts and to collect training data in the form of translation pairs, to be used later (i) for training an automatic translation alignment system for historical languages at the word/phrase level and (ii) as a gold standard for evaluating automatic alignment and machine translation systems. Ugarit is now widely used for learning new languages, especially historical languages, and as a reading environment for parallel texts. In the following sections, we present related work and similar projects; then, we give an overview of the visualization techniques used to present the alignment results. Further, we explain how the translation graph can be derived from the aligned translation pairs. Finally, we discuss the usage limitations of Ugarit, possible improvements, and future development plans.

Highlights

Translation alignment is a major task in Digital Humanities and Natural Language Processing
The accuracy of automatic alignment varies according to multiple factors, such as text type and length, size of the corpus, and translation quality and consistency
We describe the development process and show how manual alignment can be performed in Ugarit

Summary

Introduction

Translation alignment is a major task in Digital Humanities and Natural Language Processing. It is the process of comparing two texts in different languages to find translation correspondences among the textual units of the source and translation texts [1]. It can be performed at various levels of granularity, according to the project's context or the research purpose. An early example is the Blinker Project [20], which developed the first annotation tool for manual text alignment, used to align different versions of the Bible in French and English at the word level. Ugarit was initially designed to visualize the automatically aligned texts available at the Perseus Digital Library [32] and to collect training data in the form of translation pairs for implementing a statistical translation alignment system for historical languages, mainly Ancient. We discuss the limitations, possible improvements, and new features we intend to integrate into the release of Ugarit.
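As a rough illustration of how word-level translation pairs could be turned into a translation graph, the following minimal sketch counts how often two tokens are aligned with each other and stores the counts as weighted, undirected edges. It is not Ugarit's implementation: the function name `build_translation_graph`, the tuple layout, the language codes, and the sample pairs are all hypothetical and chosen only for the example.

```python
from collections import Counter, defaultdict

# Hypothetical sample data, not taken from Ugarit:
# (source_language, source_token, target_language, target_token)
pairs = [
    ("grc", "λόγος", "eng", "word"),
    ("grc", "λόγος", "eng", "speech"),
    ("grc", "λόγος", "eng", "word"),
    ("grc", "λόγος", "fas", "سخن"),
]


def build_translation_graph(translation_pairs):
    """Build an undirected, weighted graph from aligned translation pairs.

    Each node is a (language, token) tuple. The weight of an edge is the
    number of times the two tokens were aligned with each other.
    """
    graph = defaultdict(Counter)
    for src_lang, src_tok, tgt_lang, tgt_tok in translation_pairs:
        src = (src_lang, src_tok)
        tgt = (tgt_lang, tgt_tok)
        # Record the edge in both directions so lookups work from either side.
        graph[src][tgt] += 1
        graph[tgt][src] += 1
    return graph


graph = build_translation_graph(pairs)

# List the translations attached to one source token, most frequent first.
for (lang, token), count in graph[("grc", "λόγος")].most_common():
    print(lang, token, count)
```

Keeping the language as part of each node means the same structure can hold pairs from several language combinations at once, which is one simple way to connect tokens across more than two languages.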

Development Process

Alignment Workflow

Visualization Techniques

Languages Graph

Aligned Texts

Translations Graph

Ugarit in Research and Pedagogy

Future Work


Journal: Information
Publication Date: Jan 27, 2022
License type: CC BY 4.0
