Abstract

Current supervised parsers are limited by the size of their labelled training data, making improving them with unlabelled data an important goal. We show how a state-of-the-art CCG parser can be enhanced by predicting lexical categories using unsupervised vector-space embeddings of words. The use of word embeddings enables our model to better generalize from the labelled data, and allows us to accurately assign lexical categories without depending on a POS-tagger. Our approach leads to substantial improvements in dependency parsing results over the standard supervised CCG parser when evaluated on Wall Street Journal (0.8%), Wikipedia (1.8%) and biomedical (3.4%) text. We compare the performance of two recently proposed approaches for classification using a wide variety of word embeddings. We also give a detailed error analysis demonstrating where using embeddings outperforms traditional feature sets, and showing how including POS features can decrease accuracy.

Highlights

  • Combinatory Categorial Grammar (CCG) is widely used in natural language semantics (Bos, 2008; Kwiatkowski et al., 2010; Krishnamurthy and Mitchell, 2012; Lewis and Steedman, 2013a; Lewis and Steedman, 2013b; Kwiatkowski et al., 2013), largely because of its direct linkage of syntax and semantics

  • Although CCG parsers perform at state-of-the-art levels (Rimell et al., 2009; Nivre et al., 2010), full-sentence accuracy is just 25.6% on Wikipedia text, which gives a low upper bound on logical-inference approaches to question answering and textual entailment

  • Our results show that word embeddings are an effective way of adding distributional information into CCG supertagging


Summary

Introduction

Combinatory Categorial Grammar (CCG) is widely used in natural language semantics (Bos, 2008; Kwiatkowski et al., 2010; Krishnamurthy and Mitchell, 2012; Lewis and Steedman, 2013a; Lewis and Steedman, 2013b; Kwiatkowski et al., 2013), largely because of its direct linkage of syntax and semantics. This connection means that performance on semantic applications is highly dependent on the quality of the syntactic parse. The supertagger model is overly dependent on POS features: in Section 4.6 we show that supertagger performance drops dramatically on words which have been assigned an incorrect POS-tag.
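The core idea, assigning CCG lexical categories (supertags) from word embeddings rather than POS features, can be illustrated with a minimal sketch. This is not the paper's actual classifier: the embeddings, words, and nearest-neighbour rule below are purely hypothetical toy data, standing in for the distributional models and classifiers the paper evaluates.

```python
from math import sqrt

# Hypothetical toy "embeddings" (3-dimensional) paired with CCG supertags.
# In practice, embeddings are learned from unlabelled text and the
# classifier is trained on labelled CCGbank data.
TRAIN = {
    "company": ([0.9, 0.1, 0.0], "N"),
    "buys":    ([0.0, 0.8, 0.3], "(S\\NP)/NP"),
    "quickly": ([0.1, 0.2, 0.9], "(S\\NP)\\(S\\NP)"),
}

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def predict_supertag(vec):
    """Assign the supertag of the most similar labelled embedding.
    An unseen word with a verb-like embedding inherits a verbal
    category directly, with no POS-tagger in the loop."""
    _, tag = max(TRAIN.values(), key=lambda pair: cosine(vec, pair[0]))
    return tag

# A rare word unseen in labelled data, but close to "buys" in embedding space:
print(predict_supertag([0.1, 0.9, 0.2]))  # -> (S\NP)/NP
```

The point of the sketch is the generalization behaviour the paper argues for: because the classifier conditions on distributional similarity rather than POS tags, words that are rare in the labelled data can still receive accurate categories.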

