Abstract

The corpus-based approach is an emerging way to build machine translation systems. Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) are two corpus-based approaches. NMT yields better results than both the traditional rule-based approach and the statistical approach. However, NMT is computationally more expensive than SMT because of the softmax function at its output layer, whose cost grows with the size of the vocabulary. Because of this complexity constraint, NMT uses a fixed vocabulary, whereas Machine Translation (MT) is an open-vocabulary problem. This mismatch produces out-of-vocabulary (OOV) words in the predictions of the NMT system. To handle these OOV words, Word Embedding (WE) has been used in our NMT model for Punjabi to English translation. Along with WE, Byte-Pair Encoding (BPE) has also been used to increase the effectiveness of the overall system. The system has been evaluated using the automated evaluation metrics BLEU and Translation Error Rate (TER).
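
To illustrate how BPE lets a fixed vocabulary cover unseen words, the following is a minimal sketch of the generic BPE merge-learning procedure. It is not the authors' implementation: the function names (`learn_bpe`, `segment`), the `</w>` end-of-word marker, and the toy English word counts are all assumptions made for illustration, and a real system would learn merges from the Punjabi and English training corpora.

```python
from collections import Counter


def get_pair_counts(vocab):
    """Count adjacent symbol pairs, weighted by word frequency."""
    pairs = Counter()
    for symbols, freq in vocab.items():
        for a, b in zip(symbols, symbols[1:]):
            pairs[(a, b)] += freq
    return pairs


def merge_pair(pair, vocab):
    """Replace every occurrence of `pair` with one merged symbol."""
    merged = {}
    for symbols, freq in vocab.items():
        out, i = [], 0
        while i < len(symbols):
            if i + 1 < len(symbols) and (symbols[i], symbols[i + 1]) == pair:
                out.append(symbols[i] + symbols[i + 1])
                i += 2
            else:
                out.append(symbols[i])
                i += 1
        key = tuple(out)
        merged[key] = merged.get(key, 0) + freq
    return merged


def learn_bpe(word_freqs, num_merges):
    """Learn up to `num_merges` merge rules from a word -> count mapping."""
    # Each word starts as a sequence of characters plus an end-of-word marker.
    vocab = {tuple(word) + ("</w>",): freq for word, freq in word_freqs.items()}
    merges = []
    for _ in range(num_merges):
        pairs = get_pair_counts(vocab)
        if not pairs:
            break
        best = max(pairs, key=pairs.get)  # most frequent pair; ties go to the first seen
        merges.append(best)
        vocab = merge_pair(best, vocab)
    return merges


def segment(word, merges):
    """Split a (possibly unseen) word into subwords by replaying the merges in order."""
    symbols = list(word) + ["</w>"]
    for pair in merges:
        i = 0
        while i < len(symbols) - 1:
            if (symbols[i], symbols[i + 1]) == pair:
                symbols[i:i + 2] = [symbols[i] + symbols[i + 1]]
            else:
                i += 1
    return symbols


# Toy corpus (hypothetical counts, English only for readability).
toy_counts = {"low": 5, "lower": 2, "newest": 6, "widest": 3}
toy_merges = learn_bpe(toy_counts, num_merges=3)
toy_segments = segment("lowest", toy_merges)
```

In this toy run, "lowest" never appears in the training counts, yet it is still representable: `toy_segments` is `['l', 'o', 'w', 'est</w>']`, built entirely from symbols the model already knows. This is the sense in which subword units reduce OOV predictions without enlarging the softmax layer.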
