Abstract

Automatic evaluation is critical for the assessment and development of machine translation systems. In this study, we propose a new model for the automatic evaluation of machine translation. The proposed model combines standard n-gram precision features and sentence semantic mapping features with neural features, including neural language model probabilities and the embedding distances between translation outputs and their reference translations. We optimize the model against human ranking assessments with ListMLE, a representative list-wise learning-to-rank approach. Experimental results on the WMT 2015 Metrics task indicated that the proposed approach yields significantly better correlations with human assessments than several state-of-the-art baselines. In particular, the results confirmed that the proposed list-wise learning-to-rank approach is effective for optimizing automatic evaluation metrics against human ranking assessments. Further analysis demonstrated that optimizing automatic metrics with ListMLE is well founded and that the neural features bring considerable improvements over the traditional features alone.
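To make the training objective concrete: ListMLE maximizes the likelihood of the observed human ranking under the Plackett-Luce model, i.e., it minimizes the negative log-likelihood of that permutation given the metric's scores. The sketch below is a minimal, illustrative NumPy implementation of this standard loss for a single list of candidate translations; the function name and interface are ours, not the authors' code.

    import numpy as np

    def listmle_loss(scores, ranking):
        """Negative log-likelihood of a human ranking under the
        Plackett-Luce model (the ListMLE objective).

        scores  : metric scores for each candidate translation
        ranking : candidate indices ordered best-to-worst by human judges

        NOTE: illustrative sketch only; names and interface are assumptions,
        not taken from the paper.
        """
        # Reorder scores so position 0 holds the human-preferred candidate.
        s = np.asarray(scores, dtype=float)[np.asarray(ranking)]
        loss = 0.0
        for i in range(len(s)):
            # Log-sum-exp over the candidates not yet "chosen"
            # (positions i..end), stabilized by subtracting the max.
            tail = s[i:]
            m = tail.max()
            loss += m + np.log(np.exp(tail - m).sum()) - s[i]
        return loss

    # Example: three candidates for one source sentence; human judges
    # ranked candidate 2 best, then 0, then 1. A lower loss means the
    # metric's scores agree more closely with the human ranking.
    print(listmle_loss([0.3, 1.2, 0.8], [2, 0, 1]))

In training, this per-list loss would be summed over all ranked lists in the human assessment data and minimized with respect to the feature weights of the metric.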
