An HDP Model for Inducing Combinatory Categorial Grammars

Yonatan Bisk, Julia Hockenmaier

doi:10.1162/tacl_a_00211

Abstract

We introduce a novel nonparametric Bayesian model for the induction of Combinatory Categorial Grammars from POS-tagged text. It achieves state-of-the-art performance on a number of languages and induces linguistically plausible lexicons.

Highlights

What grammatical representation is appropriate for unsupervised grammar induction? Initial attempts with context-free grammars (CFGs) were not very successful (Carroll and Charniak, 1992; Charniak, 1993).
Dependency grammars make it difficult to capture non-local structures, and Blunsom and Cohn (2010) show that it may be advantageous to reformulate the underlying dependency grammar in terms of a tree-substitution grammar (TSG) which pairs words with treelets that specify the number of left and right dependents they have. We explore yet another option: instead of dependency grammars, we use Combinatory Categorial Grammar (CCG, Steedman (1996; 2000)), a linguistically expressive formalism that pairs lexical items with rich categories that capture all language-specific information.
In Table 1, we compare the performance of the basic Argument model (MLE), of our Hierarchical Dirichlet Process (HDP) model with four different settings of the hyperparameters, and of the systems presented in the PASCAL Challenge on Grammar Induction (Gelling et al., 2012).

Summary

Introduction

What grammatical representation is appropriate for unsupervised grammar induction? Initial attempts with context-free grammars (CFGs) were not very successful (Carroll and Charniak, 1992; Charniak, 1993). We explore yet another option: instead of dependency grammars, we use Combinatory Categorial Grammar (CCG, Steedman (1996; 2000)), a linguistically expressive formalism that pairs lexical items with rich categories that capture all language-specific information. This may seem a puzzling choice, since CCG requires a significantly larger inventory of categories than is commonly assumed for CFGs. However, unlike CFG nonterminals, CCG categories are not arbitrary symbols: they encode, and are determined by, the basic word order of the language and the number of arguments each word takes.
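
To make concrete the claim that CCG categories encode word order and the number of arguments, here is a minimal Python sketch of categories and the two function-application rules. It is an illustration only and not the authors' implementation: the class and function names (Atom, Functor, forward_apply, backward_apply, arity) are hypothetical, and the transitive-verb category (S\N)/N is the textbook CCG analysis for an English-like subject-verb-object language, assumed here for the example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """An atomic category such as S (sentence) or N (noun)."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Functor:
    """A complex category: result/arg looks right, result\\arg looks left."""
    result: "Category"
    slash: str  # "/" = argument expected to the right, "\\" = to the left
    arg: "Category"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


Category = Union[Atom, Functor]


def forward_apply(left: Category, right: Category) -> Optional[Category]:
    """Forward application: X/Y followed by Y yields X."""
    if isinstance(left, Functor) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Category, right: Category) -> Optional[Category]:
    """Backward application: Y followed by X\\Y yields X."""
    if isinstance(right, Functor) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


def arity(cat: Category) -> int:
    """Number of arguments a category still expects."""
    count = 0
    while isinstance(cat, Functor):
        count += 1
        cat = cat.result
    return count


S, N = Atom("S"), Atom("N")

# Assumed textbook category for an English-like transitive verb:
# take an object N to the right, then a subject N to the left.
transitive_verb = Functor(Functor(S, "\\", N), "/", N)

verb_phrase = forward_apply(transitive_verb, N)  # verb + object
sentence = backward_apply(N, verb_phrase)        # subject + verb phrase
```

The slash directions in the verb's category fix where its subject and object must appear, and the nesting depth gives the number of arguments, which is the sense in which categories are not arbitrary symbols. A verb-final language would, under the same assumptions, use leftward slashes for both arguments.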



Journal: Transactions of the Association for Computational Linguistics	Publication Date: Dec 1, 2013
Citations: 70	License type: cc-by

