Abstract
This work explores the application of minimum description length (MDL) inference to estimate the parameters of phrase-based statistical machine translation (SMT) models. Current inference techniques rely on a long, decoupled pipeline with multiple heuristic steps. MDL, by contrast, is a theoretically well-founded approach, but its empirical results are below those of the heuristically motivated state-of-the-art training pipeline. We identify potential limitations of MDL inference when applied to natural language and propose practical approaches to overcome them when inferring SMT models. An evaluation on a Spanish-to-English translation task shows that MDL inference can be adapted to yield performance close to the state of the art.