Abstract
Word alignment over parallel corpora has a wide variety of applications, including learning translation lexicons, cross-lingual transfer of language processing tools, and automatic evaluation or analysis of translation outputs. The great majority of past work on word alignment has worked by performing unsupervised learning on parallel text. Recently, however, other work has demonstrated that pre-trained contextualized word embeddings derived from multilingually trained language models (LMs) prove an attractive alternative, achieving competitive results on the word alignment task even in the absence of explicit training on parallel data. In this paper, we examine methods to marry the two approaches: leveraging pre-trained LMs but fine-tuning them on parallel text with objectives designed to improve alignment quality, and proposing methods to effectively extract alignments from these fine-tuned models. We perform experiments on five language pairs and demonstrate that our model can consistently outperform previous state-of-the-art models of all varieties. In addition, we demonstrate that we are able to train multilingual word aligners that can obtain robust performance on different language pairs.
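To make the extraction step the abstract alludes to concrete, here is a minimal sketch of pulling alignments out of a multilingual LM's contextual embeddings: compute a similarity matrix between the two sentences' subword representations, and keep only the pairs that are sufficiently probable in both softmax directions. The model name, layer index, and threshold below are illustrative assumptions rather than the paper's exact configuration, and subword-to-word mapping is omitted.

```python
# Sketch: bidirectional alignment extraction from a multilingual LM.
# Assumptions: mBERT as the LM, layer 8 representations, a small
# probability threshold; alignments are at the subword level.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-multilingual-cased")
model = AutoModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(sentence: str, layer: int = 8) -> torch.Tensor:
    """Return hidden states for one sentence, stripping [CLS]/[SEP]."""
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs, output_hidden_states=True).hidden_states[layer]
    return hidden[0, 1:-1]

def extract_alignments(src: str, tgt: str, threshold: float = 1e-3):
    src_vecs, tgt_vecs = embed(src), embed(tgt)
    sim = src_vecs @ tgt_vecs.T                       # (src_len, tgt_len)
    p_s2t = torch.softmax(sim, dim=-1)                # each src token over tgt
    p_t2s = torch.softmax(sim, dim=0)                 # each tgt token over src
    keep = (p_s2t > threshold) & (p_t2s > threshold)  # must agree both ways
    return [(i, j) for i, j in keep.nonzero().tolist()]

print(extract_alignments("Das ist gut .", "That is good ."))
```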
Highlights
Word alignment is a useful tool to tackle a variety of natural language processing (NLP) tasks, including learning translation lexicons (Ammar et al., 2016; Cao et al., 2019), cross-lingual transfer of language processing tools (Yarowsky et al., 2001; Padó and Lapata, 2009; Tiedemann, 2014; Agić et al., 2016; Mayhew et al., 2017; Nicolai and Yarowsky, 2019), semantic parsing (Herzig and Berant, 2018), and speech recognition (Xu et al., 2019)
We randomly sample 200k parallel sentence pairs from each language pair and concatenate them to train multilingual word aligners
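As a rough illustration of this data setup, the sketch below samples 200k sentence pairs per corpus and concatenates them. The file names and one-pair-per-line format are assumptions for illustration, not the authors' released preprocessing scripts.

```python
# Sketch: build a multilingual training set by sampling 200k pairs
# from each parallel corpus and concatenating the samples.
# Hypothetical file names; assumes one parallel pair per line.
import random

def sample_pairs(path: str, k: int = 200_000) -> list[str]:
    with open(path, encoding="utf-8") as f:
        lines = [line.rstrip("\n") for line in f]
    return random.sample(lines, min(k, len(lines)))

corpora = ["de-en.txt", "fr-en.txt", "ja-en.txt", "ro-en.txt", "zh-en.txt"]
multilingual = [pair for path in corpora for pair in sample_pairs(path)]
random.shuffle(multilingual)  # mix language pairs before training

with open("multilingual-train.txt", "w", encoding="utf-8") as f:
    f.write("\n".join(multilingual) + "\n")
```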
We present a neural word aligner that achieves state-of-the-art performance on five diverse language pairs and obtains robust performance in zero-shot settings
Summary
Word alignment is a useful tool to tackle a variety of natural language processing (NLP) tasks, including learning translation lexicons (Ammar et al., 2016; Cao et al., 2019), cross-lingual transfer of language processing tools (Yarowsky et al., 2001; Padó and Lapata, 2009; Tiedemann, 2014; Agić et al., 2016; Mayhew et al., 2017; Nicolai and Yarowsky, 2019), semantic parsing (Herzig and Berant, 2018), and speech recognition (Xu et al., 2019). One alternative to using statistical word-based translation models to learn alignments would be to instead train state-of-the-art neural machine translation (NMT) models on parallel corpora and extract alignments therefrom, as examined by Luong et al. (2015), Garg et al. (2019), and Zenkel et al. (2020). These methods have two disadvantages (shared with more traditional alignment methods): (1) they are directional, so the source and target sides are treated differently, and (2) they cannot take advantage of large-scale contextualized word embeddings.
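Disadvantage (1) is usually mitigated by running a directional aligner in both directions and symmetrizing the two link sets. The sketch below shows the simplest such heuristic, intersection, on which grow-diag-style variants build; the example link sets are made up for illustration.

```python
# Sketch: symmetrizing directional word alignments by intersection.
# Forward links map source token index -> target token index;
# backward links map target -> source, so they are flipped first.
def symmetrize(src2tgt: set[tuple[int, int]],
               tgt2src: set[tuple[int, int]]) -> set[tuple[int, int]]:
    """Keep only the links that both directions agree on."""
    backward_flipped = {(i, j) for (j, i) in tgt2src}
    return src2tgt & backward_flipped

forward = {(0, 0), (1, 2), (2, 1)}   # hypothetical src->tgt links
backward = {(0, 0), (2, 1), (3, 3)}  # hypothetical tgt->src links
print(symmetrize(forward, backward))  # {(0, 0), (1, 2)}
```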