Abstract

This paper uses a neural network as the predictive model and a genetic algorithm as the online optimization algorithm to simulate noise processing of a Chinese-English parallel corpus. Drawing on the powerful random global search mechanism of the genetic algorithm, it studies the principle and process of noise processing in Chinese-English parallel corpora. For the task of speaker-independent isolated-word recognition, and taking into account the shortcomings of the standard genetic algorithm and of standard neural network algorithms, the paper proposes a fast algorithm that trains the network with a genetic algorithm. Through simulation, the effects of different characteristic parameters, the number of training samples, background noise, and speaker dependence on the recognition result are analyzed, discussed, and compared with the traditional dynamic time comparison method. The paper also introduces the idea of reinforcement learning, using different reward mechanisms to resolve the inconsistency between the loss function and the evaluation metric, and different decoding methods to alleviate exposure bias. The genetic algorithm uses simple genetic operations and a survival-of-the-fittest selection mechanism to guide learning and determine the search direction, and it can search multiple regions of the solution space at the same time. It is also not restricted by conditions on the search space such as differentiability, continuity, or unimodality. In addition, a method is given for initializing the parameters of the translation model with English subword vectors. The results show that the proposed genetic-algorithm-based neural network recognition method learns network weights quickly and outperforms the standard genetic algorithm and neural network algorithms in all aspects, and that, with its high recognition rate and unique application advantages, it achieves a win-win of time and efficiency.
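
The genetic-algorithm training idea described in the abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the XOR toy data, the 2-3-1 network, the population size, tournament selection, one-point crossover, and Gaussian mutation are all hypothetical choices. Only the error target (0.001), the limit of 100 training iterations, and the initial weight range [-1, 1] are taken from the settings stated in the paper.

```python
import math
import random

# Toy data (XOR) standing in for real recognition features; purely illustrative.
DATA = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
HIDDEN = 3                              # assumed hidden-layer size
N_WEIGHTS = HIDDEN * 3 + HIDDEN + 1     # hidden weights and biases, output weights, output bias

ERROR_GOAL = 0.001      # error target stated in the paper
MAX_GENERATIONS = 100   # "training times" stated in the paper
POP_SIZE = 40           # assumed population size
MUTATION_RATE = 0.1     # assumed per-gene mutation probability


def forward(weights, x):
    """Run a 2-HIDDEN-1 perceptron whose parameters are one flat chromosome."""
    out = weights[-1]  # output bias
    for h in range(HIDDEN):
        w1, w2, b = weights[3 * h: 3 * h + 3]
        out += weights[3 * HIDDEN + h] * math.tanh(w1 * x[0] + w2 * x[1] + b)
    out = max(-60.0, min(60.0, out))  # clamp so exp() cannot overflow
    return 1.0 / (1.0 + math.exp(-out))


def mse(weights):
    """Mean squared error of one chromosome on the toy data (lower is fitter)."""
    return sum((forward(weights, x) - y) ** 2 for x, y in DATA) / len(DATA)


def tournament(population, rng, k=3):
    """Survival of the fittest: the lowest-error of k random individuals wins."""
    return min(rng.sample(population, k), key=mse)


def evolve(seed=0):
    rng = random.Random(seed)
    # Initial weights drawn uniformly from [-1, 1], as in the paper's settings.
    population = [[rng.uniform(-1.0, 1.0) for _ in range(N_WEIGHTS)]
                  for _ in range(POP_SIZE)]
    history = []
    for _ in range(MAX_GENERATIONS):
        population.sort(key=mse)
        history.append(mse(population[0]))
        if history[-1] <= ERROR_GOAL:
            break
        next_gen = [population[0][:]]  # elitism: the best individual survives unchanged
        while len(next_gen) < POP_SIZE:
            mother, father = tournament(population, rng), tournament(population, rng)
            cut = rng.randrange(1, N_WEIGHTS)  # one-point crossover
            child = mother[:cut] + father[cut:]
            child = [g + rng.gauss(0.0, 0.3) if rng.random() < MUTATION_RATE else g
                     for g in child]
            next_gen.append(child)
        population = next_gen
    # population[0] is the elite individual whose error equals history[-1].
    return population[0], history


if __name__ == "__main__":
    best, history = evolve()
    print(f"generations evaluated: {len(history)}, best error: {history[-1]:.4f}")
```

Because the fitness function only needs to be evaluated, never differentiated, the same loop would work with a non-differentiable or multimodal error measure, which is the property the abstract attributes to genetic search.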

Highlights

  • Existing Chinese-English parallel corpus noise processing systems with a high accuracy rate still have the disadvantages of time consumption, high cost, and inconvenient use [1]. The actual voice recognition system requires real-time Chinese-English parallel corpus noise processing on a general-purpose computer with limited resources [2]. Therefore, the development of fast recognition algorithms has been important in the study of noise processing of Chinese-English parallel corpora

  • When the normal random variable Y is used to replace the nonnormal random variable X, the cumulative probability distribution function value and the probability density function value at the design checking point x are the same as those of the original variable. The two types of recognition are compared and analyzed. The network parameters are set as follows: the error target is 0.001, the number of training iterations is 100, and the initial weights of the neural network lie in the range [−1, 1]; the established network is then trained and tested

  • With the help of the subword segmentation results, a word vector with subword granularity is generated to improve the word vector quality of low-frequency words. The experimental results show that the subword vector method is effective in transferring large-scale monolingual corpus information to the translation model for auxiliary training, and the accuracy of the translation model can be improved by up to 1.79%
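
The equivalence condition in the second highlight (a normal variable Y matching the cumulative probability and the probability density of X at the checking point x) determines the mean and standard deviation of Y. A minimal sketch, assuming a hypothetical exponential variable X with rate 0.5 and a checking point x = 3.0, neither of which comes from the paper:

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def equivalent_normal(cdf_at_x, pdf_at_x, x):
    """Return the normal Y whose CDF and PDF at x equal the given values."""
    z = STD_NORMAL.inv_cdf(cdf_at_x)       # standard-normal quantile of F_X(x)
    sigma = STD_NORMAL.pdf(z) / pdf_at_x   # chosen so the densities agree at x
    mu = x - sigma * z                     # chosen so the cumulative probabilities agree at x
    return NormalDist(mu, sigma)


# Hypothetical non-normal variable: exponential with rate 0.5, checked at x = 3.0.
RATE, X_STAR = 0.5, 3.0
F_X = 1.0 - math.exp(-RATE * X_STAR)   # cumulative probability of X at the checking point
f_X = RATE * math.exp(-RATE * X_STAR)  # probability density of X at the checking point
Y = equivalent_normal(F_X, f_X, X_STAR)

if __name__ == "__main__":
    print(f"mu = {Y.mean:.4f}, sigma = {Y.stdev:.4f}")
    print(f"CDF gap = {abs(Y.cdf(X_STAR) - F_X):.2e}, PDF gap = {abs(Y.pdf(X_STAR) - f_X):.2e}")
```

The two matching conditions give two equations in the two unknowns (mean and standard deviation), so the equivalent normal variable is unique for a given checking point and must be recomputed whenever the checking point moves.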

Summary

Introduction

Existing Chinese-English parallel corpus noise processing systems with a high accuracy rate still have the disadvantages of time consumption, high cost, and inconvenient use [1]. It is an effective and convenient way of information exchange and an important tool for humans to use machines. In language communication between humans and machines, the noise processing of Chinese-English parallel corpora, and especially the digital processing of voice signals, plays an important role [6]. Neural network-based machine translation has become the mainstream approach in the research field. At this stage, to overcome the vanishing and exploding gradient problems that the classic recurrent neural network model may cause, network nodes usually use complex structures such as LSTM (Long Short-Term Memory) and its variant GRU (Gated Recurrent Unit), which makes model training slow.

2. Chinese-English Parallel Corpus Noise Processing Model Based on Multilayer Perceptron Genetic Algorithm Neural Network

In fields such as system identification and pattern recognition, the problem concerns a specific system, so noisy data are easier to eliminate; in contrast, in most genetic algorithms, people know very little about the classification discriminant function information, which leads to

Results and Analysis
Conclusion