Abstract

We introduce a language-agnostic evolutionary technique for automatically extracting chunks from dependency treebanks. We evaluate these chunks on a number of morphosyntactic tasks, namely POS tagging, morphological feature tagging, and dependency parsing. We test the utility of these chunks in a host of different ways. We first learn chunking as one task in a shared multi-task framework together with POS and morphological feature tagging. The predictions from this network are then used as input to augment sequence-labelling dependency parsing. Finally, we investigate the impact chunks have on dependency parsing in a multi-task framework. Our results from these analyses show that these chunks improve performance at different levels of syntactic abstraction on English UD treebanks and a small, diverse subset of non-English UD treebanks.

Highlights

  • Shallow parsing, or chunking, consists of identifying constituent phrases (Abney, 1997)

  • As universal dependency (UD) treebanks do not contain chunking annotation, Lacroix (2018) deduced chunks by applying linguistically motivated phrase rules, and observed improvements on POS and morphological feature tagging in a shared multi-task framework for the English treebanks in UD version 2.1 (Nivre et al., 2017)

  • We show that chunking information can improve performance for POS tagging, morphological feature tagging, and dependency parsing, in both a multi-task and a single-task framework


Summary

Introduction

Shallow parsing, or chunking, consists of identifying constituent phrases (Abney, 1997). As such, it is fundamentally associated with constituency parsing, as it can be used as a first step toward finding a full constituency tree (Ciravegna and Lavelli, 1999; Tsuruoka and Tsujii, 2005). Lacroix (2018) explored the efficacy of noun phrase (NP) chunking with respect to universal dependency (UD) parsing and POS tagging for English treebanks. Because UD treebanks do not contain chunking annotation, Lacroix deduced chunks by applying linguistically motivated phrase rules, and observed improvements on POS and morphological feature tagging in a shared multi-task framework for the English treebanks in UD version 2.1 (Nivre et al., 2017). Parsing performance, however, improved for only one treebank.
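To make the idea of deducing chunks from a dependency tree concrete, the following is a minimal sketch that groups each nominal with its determiner, adjectival, compound, and numeric dependents and emits BIO-style NP tags. The relation set `NP_DEPRELS`, the function names, and the toy sentence are all assumptions made for illustration; they are not the phrase rules used by Lacroix (2018), nor the evolutionary technique introduced in this paper.

```python
from typing import List, Tuple

# A token is (form, UPOS, head, deprel); head is a 1-based index, 0 marks the root.
Token = Tuple[str, str, int, str]

# Hypothetical rule set (an assumption): dependents with these relations
# are absorbed into the chunk of their nominal head.
NP_DEPRELS = {"det", "amod", "compound", "nummod"}
NOMINAL_UPOS = {"NOUN", "PROPN"}


def chunk_head(sentence: List[Token], i: int) -> int:
    """Return the 1-based index of the nominal heading token i's chunk, or 0."""
    seen = set()
    while i not in seen:
        seen.add(i)
        _, _, head, deprel = sentence[i - 1]
        head_is_nominal = head > 0 and sentence[head - 1][1] in NOMINAL_UPOS
        if deprel in NP_DEPRELS and head_is_nominal:
            i = head  # climb to the nominal head (handles compound chains)
        else:
            break
    return i if sentence[i - 1][1] in NOMINAL_UPOS else 0


def np_chunk_tags(sentence: List[Token]) -> List[str]:
    """Assign B-NP / I-NP / O tags from contiguous tokens sharing a chunk head."""
    tags, previous = [], 0
    for i in range(1, len(sentence) + 1):
        current = chunk_head(sentence, i)
        if current == 0:
            tags.append("O")
        elif current == previous:
            tags.append("I-NP")
        else:
            tags.append("B-NP")
        previous = current
    return tags


# Toy sentence with hand-written UD-style annotation (illustrative only).
sentence = [
    ("The", "DET", 3, "det"),
    ("old", "ADJ", 3, "amod"),
    ("dog", "NOUN", 4, "nsubj"),
    ("chased", "VERB", 0, "root"),
    ("a", "DET", 6, "det"),
    ("cat", "NOUN", 4, "obj"),
]

print(list(zip((t[0] for t in sentence), np_chunk_tags(sentence))))
```

Tags of this kind can serve as the labels for a chunking task trained alongside POS and morphological feature tagging. A hand-written rule set like this one is language-specific and leaves out modifiers such as adverbs inside a noun phrase, which is one motivation for extracting chunks automatically instead.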
