Abstract
In this paper, we propose a novel recursive recurrent neural network (R 2 NN) to model the end-to-end decoding process for statistical machine translation. R 2 NN is a combination of recursive neural network and recurrent neural network, and in turn integrates their respective capabilities: (1) new information can be used to generate the next hidden state, like recurrent neural networks, so that language model and translation model can be integrated naturally; (2) a tree structure can be built, as recursive neural networks, so as to generate the translation candidates in a bottom up manner. A semi-supervised training approach is proposed to train the parameters, and the phrase pair embedding is explored to model translation confidence directly. Experiments on a Chinese to English translation task show that our proposed R 2 NN can outperform the stateof-the-art baseline by about 1.5 points in BLEU.
Highlights
Deep Neural Network (DNN), which essentially is a multi-layer neural network, has re-gained more and more attentions these years
Word embedding xt is integrated as new input information in recurrent neural networks for each prediction, but in recursive neural networks, no additional input information is used except the two representation vectors of the child nodes
We propose a Recursive Recurrent Neural Network(R2NN) to combine the recurrent neural network and recursive neural network
Summary
Deep Neural Network (DNN), which essentially is a multi-layer neural network, has re-gained more and more attentions these years. DNN is introduced to Statistical Machine Translation (SMT) to learn several components or features of conventional framework, including word alignment, language modelling, translation modelling and distortion modelling. Auli et al (2013) propose a joint language and translation model, based on a recurrent neural network. Their model predicts a target word, with an unbounded history of both source and target words. Different from the work mentioned above, which applies DNN to components of conventional SMT framework, in this paper, we propose a novel R2NN to model the end-to-end decoding process. All the representations of nodes are generated based on their child nodes, and it is difficult to integrate additional global information, such as language model and distortion model.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.