Lexical association measures and collocation extraction

Pavel Pecina

doi:10.1007/s10579-009-9101-4

Abstract

We present an extensive empirical evaluation of collocation extraction methods based on lexical association measures and their combination. The experiments are performed on three sets of collocation candidates, extracted from the Prague Dependency Treebank with manual morphosyntactic annotation and from the Czech National Corpus with automatically assigned lemmas and part-of-speech tags. The collocation candidates were manually labeled as collocational or non-collocational. The evaluation measures how well each method ranks the candidates by their likelihood of forming collocations. The methods are compared using precision-recall curves and mean average precision scores. The work focuses on two-word (bigram) collocations only. We experiment with bigrams extracted from sentence dependency structure as well as from surface word order. Further, we study the effect of corpus size on the performance of the individual methods and their combination.
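The ranking-based evaluation described in the abstract can be illustrated with a minimal sketch. The abstract does not name the individual association measures, so pointwise mutual information (PMI) is used here purely as an assumed example of one. Likewise, `average_precision` follows the common information-retrieval definition, which may differ from the exact score computed in the paper. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def pmi_scores(bigrams):
    """Score each distinct bigram by pointwise mutual information (PMI).

    PMI is an assumed example of an association measure; `bigrams` is a
    list of (left_word, right_word) tuples, one per occurrence.
    """
    n = len(bigrams)
    pair_counts = Counter(bigrams)
    left_counts = Counter(left for left, _ in bigrams)
    right_counts = Counter(right for _, right in bigrams)
    return {
        (left, right): math.log2(
            count * n / (left_counts[left] * right_counts[right])
        )
        for (left, right), count in pair_counts.items()
    }


def average_precision(ranked, gold):
    """Average the precision values at the rank of each true collocation."""
    hits = 0
    total = 0.0
    for rank, candidate in enumerate(ranked, start=1):
        if candidate in gold:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Toy data (hypothetical): one bigram occurrence per tuple.
occurrences = [
    ("new", "york"), ("new", "york"),
    ("the", "car"), ("the", "dog"),
    ("a", "car"), ("a", "dog"),
]
gold = {("new", "york")}  # candidates manually labeled as collocations

scores = pmi_scores(occurrences)
ranked = sorted(scores, key=scores.get, reverse=True)
ap = average_precision(ranked, gold)
```

Averaging such scores over several candidate sets or data folds would give a mean average precision. Sweeping a cutoff down the ranked list and recording precision and recall at each position would give a precision-recall curve.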


Journal: Language Resources and Evaluation
Publication Date: Oct 21, 2009
Citations: 335

Similar Papers

TermeX: A Tool for Collocation Extraction
Davor Delač ... Bojana Dalbelo Bašić
01 Jan 2009

Combination of the Manifold Dimensionality Reduction Methods with Least Squares Support vector machines for Classifying the Species of Sorghum Seeds.
Y M Chen ... P Lin
Scientific Reports | VOL. 6
28 Jan 2016

Evolving new lexical association measures using genetic programming
Jan Šnajder ... Bojana Dalbelo Bašić
01 Jan 2008

Point-Based Weakly Supervised Learning for Object Detection in High Spatial Resolution Remote Sensing Images
Youyou Li ... Binbin He
IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing | VOL. 14
01 Jan 2020
