Evaluating Automatic LFG F-Structure Annotation for the Penn-II Treebank

Michael Burke, Andy Way, Mairéad McCarthy, Aoife Cahill, Josef van Genabith, Ruth O’Donovan

doi:10.1007/s11168-004-7428-y

Abstract

Lexical-Functional Grammar (LFG: Kaplan and Bresnan, 1982; Bresnan, 2001; Dalrymple, 2001) f-structures represent abstract syntactic information approximating basic predicate-argument-modifier (dependency) structure or simple logical form (van Genabith and Crouch, 1996; Cahill et al., 2003a). A number of methods have been developed (van Genabith et al., 1999a,b, 2001; Frank, 2000; Sadler et al., 2000; Frank et al., 2003) for automatically annotating treebank resources with LFG f-structure information. Until recently, however, most of this work on automatic f-structure annotation was applied only to limited data sets, so while it may have shown ‘proof of concept’, it did not demonstrate that the techniques scale up to much larger data sets. More recent work (Cahill et al., 2002a,b) has presented efforts to evolve and scale the techniques established in these earlier papers to the full Penn-II Treebank (Marcus et al., 1994). In this paper, we present a number of quantitative and qualitative evaluation experiments which provide insights into the effectiveness of the techniques developed to automatically derive a set of f-structures for the more than 1,000,000 words and 49,000 sentences of Penn-II. Currently we obtain 94.85% Precision, 95.4% Recall and 95.09% F-Score for preds-only f-structures against a manually encoded gold standard.
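
As an illustration of how Precision, Recall and F-Score figures of this kind can be computed, the following minimal Python sketch compares automatically produced annotations against a gold standard. It rests on assumptions that the abstract does not state: each preds-only f-structure is flattened into a set of (relation, head, dependent) triples, a triple counts as correct only on exact match, and F-Score is taken as the harmonic mean of Precision and Recall. The function name `precision_recall_f` and the toy triples are hypothetical and are not taken from the paper.

```python
def precision_recall_f(gold, predicted):
    """Score predicted triples against gold triples (exact match only).

    Assumption: each f-structure has been flattened into a set of
    (relation, head, dependent) triples; this is an illustrative
    representation, not necessarily the one used in the paper.
    """
    gold, predicted = set(gold), set(predicted)
    matched = len(gold & predicted)
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # F-Score as the harmonic mean of precision and recall (assumed)
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Toy example: hypothetical triples for "John saw Mary in the park yesterday"
gold_triples = {
    ("subj", "see", "john"),
    ("obj", "see", "mary"),
    ("adjunct", "see", "yesterday"),
    ("adjunct", "see", "in"),
}
predicted_triples = {
    ("subj", "see", "john"),
    ("obj", "see", "mary"),
    ("adjunct", "see", "yesterday"),
}

p, r, f = precision_recall_f(gold_triples, predicted_triples)
print(f"Precision={p:.4f} Recall={r:.4f} F-Score={f:.4f}")
```

In this toy case every predicted triple is in the gold standard but one gold triple is missed, so Precision is higher than Recall and the F-Score falls between the two.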

Journal: Research on Language and Computation
Publication Date: Dec 1, 2004
Citations: 36

