Abstract

In practical applications, the accuracy of domain terminology translation is an important criterion for evaluating domain-specific machine translation models. To address the phrase mismatches and improper translations caused by translating English terminology phrases word by word, this paper constructs a dictionary of terminology phrases in the field of electrical engineering and proposes three schemes for integrating the dictionary knowledge into the translation model. Scheme 1 replaces the terminology phrases in the source language. Scheme 2 adds a residual connection at the encoder after the terminology phrases are replaced. Scheme 3 segments the target language by combining character segmentation with terminology segmentation and uses an additional loss module during training. The results show that all three schemes outperform the baseline model on both BLEU score and the correct translation rate of terminology words. On the test set, the highest terminology accuracy was 48.3% higher than that of the baseline model, and the BLEU score was up to 3.6 higher. These results are also analyzed and discussed in the paper.
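
As a rough illustration of the source-side replacement in Scheme 1, the sketch below performs a greedy longest-match lookup of terminology phrases in a whitespace-tokenized English sentence. It is not the paper's implementation. The function name `replace_terms`, the two-entry `TERM_DICT`, the choice of Chinese as the target language, and the choice to substitute the dictionary translation directly into the source sentence are all assumptions made for this example.

```python
# Hypothetical two-entry terminology dictionary (illustrative only,
# not taken from the paper's dictionary).
TERM_DICT = {
    "circuit breaker": "断路器",
    "short circuit": "短路",
}


def replace_terms(sentence, term_dict):
    """Replace known terminology phrases using greedy longest-match lookup.

    Assumes whitespace tokenization and lowercase dictionary keys.
    """
    tokens = sentence.split()
    # Longest phrase length (in tokens) present in the dictionary.
    max_len = max((len(key.split()) for key in term_dict), default=0)
    output = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest candidate phrase first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            phrase = " ".join(tokens[i:i + n]).lower()
            if phrase in term_dict:
                output.append(term_dict[phrase])
                i += n
                matched = True
                break
        if not matched:
            # No dictionary entry starts here; keep the original token.
            output.append(tokens[i])
            i += 1
    return " ".join(output)


print(replace_terms("The circuit breaker trips on a short circuit", TERM_DICT))
```

Matching the longest phrase first keeps a multi-word term from being broken up by a shorter entry that shares its first word, which is the word-by-word failure the abstract describes.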
