Abstract
We propose using pre-trained embeddings as features of a regression model for sentence-level quality estimation of machine translation. We combine the freely available BERT and LASER multilingual embeddings to train a neural regression model. In a second method, the input features include not only the pre-trained embeddings but also the log probability from any machine translation (MT) system. Both methods are applied to several language pairs and are evaluated both as a classical quality estimation system (predicting the HTER score) and as an MT metric (predicting human judgements of translation quality).
Highlights
Quality estimation (Blatz et al, 2004; Specia et al, 2009) aims to predict the quality of machine translation (MT) outputs without human references, which is what sets it apart from translation metrics like BLEU (Papineni et al, 2002) or TER (Snover et al, 2006).
Input features: embeddings extracted from LASER and BERT, and log probability obtained from a Transformer NMT model
Data: We gathered the data from the WMT16-WMT18 shared tasks on sentence-level quality estimation for English-German (En-De) (Bojar et al, 2016a, 2017a; Specia et al, 2018), from WMT17-WMT18 for German-English (De-En), and from WMT18 for English-Czech (En-Cs).
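As a rough illustration of how such inputs might be assembled, the sketch below concatenates a BERT vector and a LASER vector, optionally appends an MT log probability, and passes the result through a small feed-forward regressor. Everything here is an assumption made for illustration: the function names (`build_features`, `init_mlp`, `predict`), the single hidden layer, the tanh activation, and the random untrained weights are hypothetical and are not taken from the paper, which does not specify its architecture in this summary.

```python
import math
import random


def build_features(bert_vec, laser_vec, log_prob=None):
    """Concatenate the two embeddings; optionally append the MT log probability.

    Hypothetical helper: the real feature layout used by the authors is not
    described in this summary.
    """
    features = list(bert_vec) + list(laser_vec)
    if log_prob is not None:
        features.append(log_prob)
    return features


def init_mlp(n_in, n_hidden, seed=0):
    """Create small random weights for a one-hidden-layer regressor (untrained)."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.uniform(-0.1, 0.1) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def predict(params, features):
    """Forward pass: tanh hidden layer followed by a linear output (one score)."""
    w1, b1, w2, b2 = params
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, features)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


# Toy vectors stand in for real sentence embeddings, which are much larger.
features = build_features([0.1, 0.2, 0.3, 0.4], [0.5, 0.6, 0.7], log_prob=-1.5)
params = init_mlp(n_in=len(features), n_hidden=5)
score = predict(params, features)
```

In practice the weights would be fitted to HTER or DA labels with a standard regression loss; the forward pass above only shows how the feature vector flows into a single predicted score.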
Summary
Quality estimation (Blatz et al, 2004; Specia et al, 2009) aims to predict the quality of machine translation (MT) outputs without human references, which is what sets it apart from translation metrics like BLEU (Papineni et al, 2002) or TER (Snover et al, 2006). Most approaches to quality estimation are trained to predict the post-editing effort, i.e. the number of corrections translators have to make to obtain an adequate translation. The effort is measured by the HTER metric (Snover et al, 2006) applied to human post-edits. In addition, we apply our method to predict direct human assessment (DA) (Graham et al, 2017). DA is normally the target against which MT metrics (Ma et al, 2018) are compared, but we evaluate our predictions against it as well, because the number of post-edits and a human assessment are different measures of quality. The main difference between MT metrics and quality estimation is that quality estimation is computed without reference sentences.
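To make the HTER target concrete, here is a minimal sketch that approximates it as the word-level edit distance between the MT output and its human post-edit, divided by the post-edit length. This is a simplification and should be read as such: the full TER algorithm also counts block shifts as single edits, which this sketch omits, so its values can be higher than true HTER. The function names and whitespace tokenization are assumptions made for illustration.

```python
def word_edit_distance(hyp, ref):
    """Levenshtein distance over word lists (insertions, deletions, substitutions)."""
    prev = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, start=1):
        curr = [i]
        for j, r in enumerate(ref, start=1):
            cost = 0 if h == r else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def approx_hter(mt_output, post_edit):
    """Approximate HTER: edits needed to reach the post-edit, per post-edit word.

    Shift operations from the full TER definition are NOT modelled here.
    """
    hyp = mt_output.split()
    ref = post_edit.split()
    if not ref:
        raise ValueError("post-edit must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


# One missing word ("the") against a six-word post-edit.
example = approx_hter("the cat sat on mat", "the cat sat on the mat")
```

A quality estimation model is trained to predict this kind of score from the source sentence and MT output alone, since no post-edit is available at prediction time.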