Abstract

Multilingual chat systems are increasingly important for organizations that provide online conversational services to their customers, and robust machine translation (MT) systems can facilitate them effectively. Translating user-generated content is regarded as one of the challenging tasks for MT. Translating conversations is more challenging still, since the meaning of an utterance in a conversation usually depends on its context. Moreover, chats in conversational systems are usually informal and contain code-mixing (a mix of more than one language) and grammatical inconsistencies. In this paper, we use state-of-the-art Transformer models to build our MT systems and translate conversations between customer service agents and customers. We propose a novel method that effectively selects contextual information from the source text of the conversation to be translated. We also employ a terminology-based pseudo in-domain corpus mining strategy for fine-tuning our translation model. We evaluate our methods on the German-English WMT20 Shared Task on Chat Translation dataset and obtain 61.1 and 63.9 BLEU points on the evaluation test sets for English-to-German and German-to-English, respectively, surpassing the present state-of-the-art MT systems by 0.7 and 1.5 BLEU points, respectively.

