Abstract
We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks. We evaluate these chunks on a number of morphosyntactic tasks, namely POS tagging, morphological feature tagging, and dependency parsing. We test the utility of these chunks in a host of different ways. We first learn chunking as one task in a shared multi-task framework together with POS and morphological feature tagging. The predictions from this network are then used as input to augment sequence-labelling dependency parsing. Finally, we investigate the impact chunks have on dependency parsing in a multi-task framework. Our results from these analyses show that these chunks improve performance at different levels of syntactic abstraction on English UD treebanks and a small, diverse subset of non-English UD treebanks.
Highlights
Shallow parsing, or chunking, consists of identifying constituent phrases (Abney, 1997)
As universal dependency (UD) treebanks do not contain chunking annotation, Lacroix (2018) deduced chunks by adopting linguistics-based phrase rules, and observed improvements on POS and morphological feature tagging in a shared multi-task framework for the English treebanks in UD version 2.1 (Nivre et al., 2017)
We show that chunking information can improve performance for POS tagging, morphological feature tagging, and dependency parsing, in both a multi-task and a single-task framework
Summary
Chunking consists of identifying constituent phrases (Abney, 1997). As such, it is fundamentally associated with constituency parsing, as it can be used as a first step for finding a full constituency tree (Ciravegna and Lavelli, 1999; Tsuruoka and Tsujii, 2005). Lacroix (2018) explored the efficacy of noun phrase (NP) chunking with respect to universal dependency (UD) parsing and POS tagging for English treebanks. As UD treebanks do not contain chunking annotation, Lacroix deduced chunks by adopting linguistics-based phrase rules, and observed improvements on POS and morphological feature tagging in a shared multi-task framework for the English treebanks in UD version 2.1 (Nivre et al., 2017). Parsing performance improved for only one treebank.
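To make the idea of deducing chunks from dependency annotation concrete, the following is a minimal sketch of one possible rule. It is an illustrative assumption, not the rule set used by Lacroix (2018) and not the evolutionary technique introduced here. The `Token` type, the `np_chunk_tags` function, and the choice of modifier relations (`det`, `amod`, `nummod`, `compound`) are all hypothetical. Each nominal token is grouped with the contiguous modifiers to its left that attach to it, and the result is emitted as BIO tags suitable for sequence labelling.

```python
from typing import List, NamedTuple


class Token(NamedTuple):
    """A hypothetical, simplified view of one CoNLL-U token."""
    form: str
    upos: str    # universal POS tag
    head: int    # 1-based index of the head token, 0 for the root
    deprel: str  # universal dependency relation


# Assumed rule ingredients; chosen for illustration only.
NOMINAL_UPOS = {"NOUN", "PROPN", "PRON"}
NP_MODIFIER_DEPRELS = {"det", "amod", "nummod", "compound"}


def np_chunk_tags(sentence: List[Token]) -> List[str]:
    """Return one BIO tag per token, marking simple NP chunks."""
    tags = ["O"] * len(sentence)
    for i, token in enumerate(sentence):
        if token.upos not in NOMINAL_UPOS:
            continue
        # Extend the chunk leftwards while the preceding token is a
        # modifier whose head is this nominal (heads are 1-based).
        start = i
        while start > 0:
            previous = sentence[start - 1]
            if previous.head == i + 1 and previous.deprel in NP_MODIFIER_DEPRELS:
                start -= 1
            else:
                break
        # Later nominals may absorb earlier ones (e.g. compounds),
        # so overwriting earlier tags is intentional.
        tags[start] = "B-NP"
        for j in range(start + 1, i + 1):
            tags[j] = "I-NP"
    return tags


example = [
    Token("The", "DET", 3, "det"),
    Token("quick", "ADJ", 3, "amod"),
    Token("fox", "NOUN", 4, "nsubj"),
    Token("chased", "VERB", 0, "root"),
    Token("a", "DET", 6, "det"),
    Token("cat", "NOUN", 4, "obj"),
]
print(list(zip([t.form for t in example], np_chunk_tags(example))))
```

Tags produced this way could serve as the labels of an auxiliary chunking task alongside POS and morphological feature tagging. A realistic rule set would need to handle right-hand modifiers, coordination, and non-contiguous dependents, which this sketch ignores.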