Evaluating Universal Dependency Parser Recovery of Predicate Argument Structure via CompChain Analysis

Sagar Indurkhya, Robert C Berwick, Beracah Yankama

doi:10.18653/v1/2021.starsem-1.11

Abstract

Accurate recovery of predicate-argument structure from a Universal Dependency (UD) parse is central to downstream tasks such as extraction of semantic roles or event representations. This study introduces compchains, a categorization of the hierarchy of predicate dependency relations present within a UD parse. Accuracy of compchain classification serves as a proxy for measuring accurate recovery of predicate-argument structure from sentences with embedding. We analyzed the distribution of compchains in three UD English treebanks, EWT, GUM and LinES, revealing that these treebanks are sparse with respect to sentences with predicate-argument structure that includes predicate-argument embedding. We evaluated the CoNLL 2018 Shared Task UDPipe (v1.2) baseline (dependency parsing) models as compchain classifiers for the EWT, GUM and LinES UD treebanks. Our results indicate that these three baseline models exhibit poorer performance on sentences with predicate-argument structure with more than one level of embedding; we used compchains to characterize the errors made by these parsers and present examples of erroneous parses produced by the parser that were identified using compchains. We also analyzed the distribution of compchains in 58 non-English UD treebanks and then used compchains to evaluate the CoNLL’18 Shared Task baseline model for each of these treebanks. Our analysis shows that performance with respect to compchain classification is only weakly correlated with the official evaluation metrics (LAS, MLAS and BLEX). We identify gaps in the distribution of compchains in several of the UD treebanks, thus providing a roadmap for how these treebanks may be supplemented. We conclude by discussing how compchains provide a new perspective on the sparsity of training data for UD parsers, as well as the accuracy of the resulting UD parses.
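
To make the idea of a hierarchy of predicate dependency relations concrete, the following is a minimal sketch rather than the authors' implementation. It assumes, purely for illustration, that a chain is the sequence of relations reached by starting at the root predicate and following `ccomp` and `xcomp` links downward; the paper's actual definition and relation inventory may differ. The names `PREDICATE_RELS`, `parse_conllu` and `relation_chains` are hypothetical, and the sample sentence is a hand-built toy parse, not treebank data.

```python
from collections import defaultdict

# Assumption: only these relations count as predicate-embedding links.
# The paper's actual inventory of relations may differ.
PREDICATE_RELS = {"ccomp", "xcomp"}


def parse_conllu(text):
    """Return (id, form, head, deprel) tuples for one CoNLL-U sentence."""
    tokens = []
    for line in text.strip().splitlines():
        if not line or line.startswith("#"):
            continue  # skip blank lines and sentence-level comments
        cols = line.split("\t")
        if not cols[0].isdigit():
            continue  # skip multiword token ranges (1-2) and empty nodes (1.1)
        tokens.append((int(cols[0]), cols[1], int(cols[6]), cols[7]))
    return tokens


def relation_chains(tokens):
    """List every root-to-leaf chain of predicate relations in the parse."""
    children = defaultdict(list)
    for tid, _form, head, deprel in tokens:
        # Drop language-specific subtypes such as "xcomp:pred".
        children[head].append((tid, deprel.split(":")[0]))

    chains = []

    def walk(node, path):
        nexts = [(tid, rel) for tid, rel in children[node] if rel in PREDICATE_RELS]
        if not nexts:
            chains.append(tuple(path))  # no deeper embedding: chain ends here
            return
        for tid, rel in nexts:
            walk(tid, path + [rel])

    for tid, rel in children[0]:
        if rel == "root":
            walk(tid, ["root"])
    return chains


# Toy parse of "She said he wants to leave" (hand-annotated for illustration).
ROWS = [
    (1, "She", 2, "nsubj"),
    (2, "said", 0, "root"),
    (3, "he", 4, "nsubj"),
    (4, "wants", 2, "ccomp"),
    (5, "to", 6, "mark"),
    (6, "leave", 4, "xcomp"),
]
SAMPLE = "\n".join(
    f"{i}\t{form}\t_\t_\t_\t_\t{head}\t{rel}\t_\t_" for i, form, head, rel in ROWS
)

if __name__ == "__main__":
    print(relation_chains(parse_conllu(SAMPLE)))
```

Under these assumptions, comparing the chains extracted from a gold parse with those extracted from a predicted parse of the same sentence gives a per-sentence correct/incorrect label, which is the kind of classification view the abstract describes.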

Highlights

Accurate recovery of predicate-argument structure from a Universal Dependency (UD) parse is central to downstream tasks such as extraction of semantic roles or event representations
We used the compchain classification task to evaluate the CoNLL’18 shared task baseline models for languages other than English; this was motivated by the observation that since the UD treebanks are derived from a variety of textual sources, and have varying compchain distributions, we can use them collectively to evaluate and characterize the performance of the UDPipe dependency parser under various training conditions
... two or more in the UD treebanks suggests that we should not necessarily expect a dependency parser trained on the treebank to generalize out of the training domain. However, there is empirical evidence that humans do have the capacity to acquire a grammar from sentences with at most degree-1 embedding and later correctly parse sentences with a degree of embedding of two or more (Wexler and Culicover, 1980; Morgan, 1986; Lightfoot, 1989); the poor performance on compchains of length three or more suggests that the CoNLL 2018 Shared Task baseline models are not able to generalize beyond the distribution of syntactic structures they were trained upon, in contrast to human learners

Summary

Introduction

Accurate recovery of predicate-argument structure from a Universal Dependency (UD) parse is central to downstream tasks such as extraction of semantic roles or event representations. The Universal Dependencies (UD) project (De Marneffe et al., 2014; Nivre et al., 2016) is a multilingual annotation scheme for dependency grammars that has gained wide usage (Zeman et al., 2017; Kong et al., 2017; Qi et al., 2020). Given this usage, automatically identifying whether a dependency parse is correct or incorrect, as well as the potential source of such errors, becomes an important part of NLP pipelines. Such identification can prevent errors from propagating to downstream applications such as the identification of predicate-argument structure, which is involved in semantic role labeling and sentiment analysis. In this study we introduce compchains, a categorization of the hierarchy of predicate dependency relations present within a Universal Dependency (UD) parse; this categorization serves as a proxy for

Related Work

Compchains

Evaluation of English UD Treebanks

Multilingual Evaluation of UD Treebanks

Impact of Word Ordering

Conclusion

Findings

A Appendix

Publication Date: Jan 1, 2021
Citations: 1	License type: cc-by
