Abstract

Neural machine translation (NMT) is a software that uses neural network techniques to translate text from one language to another. As NMT models are on the rise, the focus is on translating everyday mundane sentences. However, it is also necessary to start paying attention to the translation of domain-specific text, such as lyrics or poetry. For example, even one of the most famous NMT models—Google Translate—failed to give an accurate English translation of a famous Korean nursery rhyme, "Airplane" (비행기). To teach the model to retain specific information other than semantics, we need specific data which contains the exact information that we are attempting to teach. In the case of rhythmically accurate lyrics translation—translated lyrics that can be used to sing along to the original melody—we need corresponding data, containing lyrical and rhythmical properties, all the while being semantically accurate. However, as there is not enough data that fits our criteria, we propose a novel method we call 'stacked fine-tuning'. We fine-tuned a pre-trained model first with a dataset from the lyrics domain, and then with a smaller dataset containing the rhythmical properties, to teach the model to translate rhythmically accurate lyrics. To evaluate the effectiveness of our approach, we translated two famous Korean nursery rhymes to English and matched them to the original melody. Our stacked fine-tuning method resulted in an NMT model that could maintain the rhythmical characteristics of lyrics during translation while single fine-tuned models failed to do so.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.