Abstract

Unsupervised machine translation has been proposed to address the lack of parallel corpora for English translation. It relies on unsupervised pretraining, denoising autoencoders, back-translation, and shared latent representations to learn the translation task from monolingual corpora alone. This paper analyzes unsupervised neural machine translation (NMT) on dissimilar language pairs and builds an unsupervised NMT model from pseudo-parallel data. It first examines the low performance of unsupervised translation on dissimilar language pairs from three aspects: bilingual word embedding quality, shared words, and word order. It then proposes artificial shared word replacement and preordering strategies, which increase the number of shared words between dissimilar languages and reduce the differences in their syntactic structure, thereby improving translation performance on such pairs. The denoising autoencoder and the shared latent representation mechanism in unsupervised English machine translation are needed only in the early stage of training, and learning a shared latent representation limits further performance gains in the different translation directions. In addition, training the denoising autoencoder on repeatedly altered training data slows model convergence, especially for divergent languages. To address these issues, this paper presents an unsupervised NMT model based on pseudo-parallel data. It trains two standard supervised NMT models on the pseudo-parallel corpus generated by the unsupervised NMT system, which improves translation performance and speeds up convergence. Finally, the English intelligent translation model is deployed on a wireless network server, where users can access it over the wireless network.
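
The pseudo-parallel step described above can be illustrated with a minimal sketch. It is not the paper's implementation: the function name `build_pseudo_parallel`, the pairing convention (synthetic input, genuine monolingual sentence as reference, as in standard back-translation), and the toy word-by-word translators that stand in for the unsupervised NMT system are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical type: any function that maps a sentence to its translation.
Translator = Callable[[str], str]
Pairs = List[Tuple[str, str]]


def build_pseudo_parallel(
    mono_src: List[str],
    mono_tgt: List[str],
    src_to_tgt: Translator,
    tgt_to_src: Translator,
) -> Tuple[Pairs, Pairs]:
    """Build one pseudo-parallel training set per translation direction.

    Assumption: each pair is (synthetic input, genuine reference), so each
    supervised model learns to produce human-written text.
    """
    # Training data for the source-to-target model: real target references.
    train_src2tgt = [(tgt_to_src(t), t) for t in mono_tgt]
    # Training data for the target-to-source model: real source references.
    train_tgt2src = [(src_to_tgt(s), s) for s in mono_src]
    return train_src2tgt, train_tgt2src


def word_by_word(table: Dict[str, str]) -> Translator:
    """Toy stand-in for an unsupervised NMT system: dictionary lookup."""
    def translate(sentence: str) -> str:
        return " ".join(table.get(word, word) for word in sentence.split())
    return translate


# Toy vocabulary, invented for this example only.
EN_DE = {"the": "das", "house": "haus", "is": "ist", "small": "klein"}
DE_EN = {de: en for en, de in EN_DE.items()}

if __name__ == "__main__":
    en2de, de2en = word_by_word(EN_DE), word_by_word(DE_EN)
    src2tgt, tgt2src = build_pseudo_parallel(
        ["the house is small"], ["das haus ist klein"], en2de, de2en
    )
    print(src2tgt)
    print(tgt2src)
```

Each returned list would then be passed to an ordinary supervised NMT trainer, one per direction, so the two models no longer need to share a latent representation.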
