Abstract
English machine translation is a research direction in natural language processing with considerable scientific and practical value in the current artificial intelligence boom. The variability of language, the limited ability to express semantic information, and the scarcity of parallel corpus resources all limit the usefulness and adoption of English machine translation in practical applications. The self-attention mechanism has received considerable attention in English machine translation tasks because its computation is highly parallelizable, which reduces training time, and because it can capture the semantic relevance of all words in the context. Unlike recurrent neural networks, however, the self-attention mechanism ignores the position and structure information between context words. To let the model use position information between words, English machine translation models based on the self-attention mechanism use sine and cosine position coding to represent the absolute position of each word. This method can reflect relative distance, but it does not convey directionality. A new English machine translation model is therefore proposed, based on a logarithmic position representation method and the self-attention mechanism. The model retains the distance and directional information between words as well as the efficiency of the self-attention mechanism. Experiments show that the nonstrict phrase extraction method can effectively extract phrase translation pairs from n-best word alignment results and that the extraction constraint strategy further improves translation quality. Compared with traditional phrase extraction based on a single alignment, nonstrict phrase extraction combined with n-best alignment results significantly improves translation quality.
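The claim that sine and cosine position coding reflects distance but not direction can be checked numerically. The following is a minimal sketch, assuming the standard Transformer formulation of sinusoidal encoding; the base of 10000, the dimension of 16, and the function names are illustrative assumptions, not details taken from the paper. It does not implement the proposed logarithmic position representation, which is not specified in this abstract.

```python
import math


def sinusoidal_encoding(pos, d_model=16, base=10000.0):
    """Return the sine/cosine position vector for a single position.

    d_model and base are assumed values (the common Transformer defaults
    in spirit); the paper's own settings are not given in the abstract.
    """
    vec = []
    for k in range(0, d_model, 2):
        angle = pos / (base ** (k / d_model))
        vec.append(math.sin(angle))
        vec.append(math.cos(angle))
    return vec


def similarity(i, j, d_model=16):
    """Dot product between the encodings of positions i and j."""
    u = sinusoidal_encoding(i, d_model)
    v = sinusoidal_encoding(j, d_model)
    return sum(a * b for a, b in zip(u, v))


# A neighbour 3 positions to the left and one 3 positions to the right
# of position 10 receive the same score: the dot product reduces to a
# sum of cos(w_k * (i - j)), which is symmetric in the offset.
left = similarity(10, 7)
right = similarity(10, 13)
print(abs(left - right) < 1e-9)
```

Because the score depends only on the absolute offset, a model that compares these encodings through dot products alone cannot tell whether a context word precedes or follows the current word, which is the limitation the abstract attributes to this method.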
Highlights
After decades of development, and with the continuous improvement of information and computer technology, research on English machine translation has gradually evolved from its origins in simple linguistics and computational science [1, 2].
English machine translation has not yet reached a fully intelligent understanding of semantic information, so computers must continue to be given the ability to recognize and understand it [7, 8].
Related scholars propose a seed sentence segmentation method for tree-based English machine translation systems [25]. This method first divides a long sentence into shorter clauses, translates the clauses, and then merges the sub-translations to generate the full-sentence translation. It analyzes the syntax tree produced by an existing syntactic parser to segment long sentences and merge their translations.
Summary
After decades of development, and with the continuous improvement of information and computer technology, research on English machine translation has gradually evolved from its origins in simple linguistics and computational science [1, 2] into a comprehensive research field that integrates semantics, mathematics, corpus studies, computing science, artificial intelligence, and the biological sciences. As translation workloads grow, the cost of manual translation becomes much higher than the cost of English machine translation. Training a translator of a less widely used language who also has professional domain knowledge takes a very long time and consumes a great deal of manpower [16]. Experiments show that the nonstrict phrase extraction method is better suited to extracting phrases from n-best alignments, and that imposing extraction constraints can further improve translation quality.