Abstract

Syntactic dependency parsing is an important task in natural language processing. Unsupervised dependency parsing aims to learn a dependency parser from sentences that have no annotation of their correct parse trees. Despite its difficulty, unsupervised parsing is an interesting research direction because of its capability of utilizing almost unlimited unannotated text data. It also serves as the basis for other research in low-resource parsing. In this paper, we survey existing approaches to unsupervised dependency parsing, identify two major classes of approaches, and discuss recent trends. We hope that our survey can provide insights for researchers and facilitate future research on this topic.

Highlights

  • Dependency parsing is an important task in natural language processing that aims to capture syntactic information in sentences in the form of dependency relations between words

  • We focus on learning a dependency parser from an unannotated dataset, i.e., a set of sentences without any parse tree annotation

  • Two evaluation metrics are widely used in previous work of unsupervised dependency parsing (Klein and Manning, 2004): directed dependency accuracy (DDA) and undirected dependency accuracy (UDA)
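The two metrics above can be sketched in a few lines. The snippet below is a minimal illustration (the function name `dda_uda` and the head-array encoding are my own choices, not from the survey): each word is scored by whether its predicted head matches the gold head, and UDA additionally accepts an arc whose direction is reversed relative to the gold tree.

```python
def dda_uda(gold_heads, pred_heads):
    """Compute directed (DDA) and undirected (UDA) dependency accuracy.

    Heads are given as 1-indexed word positions; 0 denotes the root.
    gold_heads[i] / pred_heads[i] is the head of word i+1.
    """
    assert len(gold_heads) == len(pred_heads)
    n = len(gold_heads)
    # DDA: predicted head must match the gold head exactly.
    directed = sum(g == p for g, p in zip(gold_heads, pred_heads))
    # UDA: a predicted arc (p -> d) also counts if the gold tree
    # contains the same arc with the direction flipped (d -> p).
    undirected = 0
    for d, (g, p) in enumerate(zip(gold_heads, pred_heads), start=1):
        if g == p:
            undirected += 1
        elif p != 0 and gold_heads[p - 1] == d:
            undirected += 1
    return directed / n, undirected / n

# Example: gold tree 1 -> 2 -> 3 (word 1 is root), prediction
# attaches word 1 under word 2 instead: only the arc direction
# between words 1 and 2 is wrong, so UDA exceeds DDA.
dda, uda = dda_uda([0, 1, 2], [2, 0, 2])
```

Since unsupervised parsers are usually evaluated without dependency labels, only these unlabeled scores (rather than labeled attachment scores) are reported in this line of work.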


Summary

Introduction

Dependency parsing is an important task in natural language processing that aims to capture syntactic information in sentences in the form of dependency relations between words. Multiple research directions try to learn dependency parsers with few or even no syntactically annotated training sentences, including transfer learning, unsupervised learning, and semi-supervised learning. Among these directions, unsupervised learning of dependency parsers (a.k.a. unsupervised dependency parsing and dependency grammar induction), which aims to obtain a dependency parser without using any annotated sentences, is the most challenging. Despite its difficulty, unsupervised parsing is an interesting research direction, because it would reveal ways to utilize almost unlimited text data without the need for human annotation, and because it can serve as the basis for studies of transfer and semi-supervised learning of parsers.

Proceedings of the 28th International Conference on Computational Linguistics, pages 2522–2533, Barcelona, Spain (Online), December 8-13, 2020

Problem Definition
Related Areas
Models
Inference
Learning Objective
Learning Algorithm
Pros and Cons
Discriminative Approaches
Autoencoder-Based Approaches
Variational Autoencoder-Based Approaches
Other Discriminative Approaches
Combined Approaches
Neural Parameterization
Lexicalization
Methods
Big Data
Unsupervised Multilingual Parsing
Benchmarking on the WSJ Corpus
Utilization of Syntactic Information in Pretrained Language Modeling
Inspiration for Other Tasks
Interpretability
Conclusion
