Abstract

In this article we will describe the design and implementation of Jane, an efficient hierarchical phrase-based (HPB) toolkit developed at RWTH Aachen University. The system has been used by RWTH at several international evaluation campaigns, including the WMT and NIST evaluations, and is now freely available for non-commercial application. We will go through the main features of Jane, which include, among others, support for different search strategies, different language model formats, support for syntax-based enhancements to the HPB machine translation paradigm, string-to-dependency translation, extended lexicon models, different methods for minimum-error-rate training and distributed operation on a computer cluster. Special attention has been paid to the efficiency of the decoder, clean code and quality assurance through unit and regression testing. Results on current machine translation tasks are reported, which show that the system is able to obtain state-of-the-art performance.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call