Abstract

-ing forms in English are reported to be problematic for Machine Translation and are often the focus of rules in Controlled Language rule sets. We investigated how problematic -ing forms are for an RBMT system, translating into four target languages in the IT domain. Constituent-based human evaluation was used and the results showed that, in general, -ing forms do not deserve their bad reputation. A comparison with the results of five automated MT evaluation metrics showed promising correlations. Some issues prevail, however, and can vary from target language to target language. We propose different strategies for dealing with these problems, such as Controlled Language rules, semi-automatic post-editing, source text tagging and “post-editing” the source text.

Highlights

  • The focus of this paper is on evaluating the Machine Translation (MT) output for one linguistic feature, -ing forms, into four target languages (French, Spanish, German and Japanese)

  • There is at least some consensus that -ing forms can be problematic for RBMT

  • Since our research focuses on evaluating the RBMT output for -ing forms, and little work has been done to date using automated metrics for specific sub-sentential linguistic constituents (with the exception of constituents such as subjects, NPs and CNPs evaluated by Callison-Burch et al. (2007)), we opted for a human evaluation

Summary

Introduction

The focus of this paper is on evaluating the Machine Translation (MT) output for one linguistic feature, -ing forms, into four target languages (French, Spanish, German and Japanese). Our interest in -ing forms stems from our study of Controlled Language (CL). CL rules can be implemented to reduce ambiguities in the source text in order to improve the machine translated output (Bernth and Gdaniec, 2001; O’Brien, 2003). CL rule sets often include one or more rules on -ing forms in English. According to Dervišević and Steensland (2005), AECMA Simplified English does not allow the use of either gerunds or present participles, with the exception of certain technical terms. The Microsoft Manual of Style for Technical Publications (MSTP) (Microsoft Corporation, 1998) cautions against the use of gerunds. The following example, taken from our research corpus, illustrates the problem:

Aranberri-Monasterio & O’Brien
