Abstract

Background

Competitions in text mining have been used to measure the performance of automatic text processing solutions against a manually annotated gold standard corpus (GSC). The preparation of a GSC is time-consuming and costly, and the final corpus typically consists of at most a few thousand documents annotated with a limited set of semantic groups. To overcome these shortcomings, the CALBC project partners (PPs) produced a large-scale annotated biomedical corpus covering four semantic groups, the first version of the Silver Standard Corpus (SSC-I), by harmonising the annotations of several automatic text mining solutions. The four semantic groups are chemical entities and drugs (CHED), genes and proteins (PRGE), diseases and disorders (DISO), and species (SPE). This corpus was used for the First CALBC Challenge, which asked participants to annotate the corpus with their text processing solutions.

Results

All four PPs of the CALBC project and, in addition, 12 challenge participants (CPs) contributed annotated data sets for an evaluation against the SSC-I. CPs could either ignore the training data and deliver annotations from their existing annotation system, or train a machine-learning approach on the provided pre-annotated data. In general, the annotation solutions performed worse on entities from the categories CHED and PRGE than on entities categorised as DISO and SPE. The best performance across all semantic groups was achieved by two annotation solutions that had been trained on the SSC-I. The participants' data sets were used to generate the harmonised Silver Standard Corpus II (SSC-II), provided the participant had not used the annotated SSC-I data for training. The participants' solutions were then measured against the SSC-II and again showed better results for DISO and SPE than for CHED and PRGE.

Conclusions

The SSC-I delivers a large set of annotations (1,121,705) for a large number of documents (100,000 Medline abstracts). The annotations cover four semantic groups and are sufficiently homogeneous to be reproduced with a trained classifier, leading to an average F-measure of 85%. Benchmarking the annotation solutions against the SSC-II leads to better performance for the CPs' annotation solutions than benchmarking against the SSC-I.
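The harmonisation step that turns the annotations of several systems into a silver standard can be illustrated with a small sketch. The abstract does not specify the harmonisation procedure used for the SSC, so the following Python fragment only assumes a simple majority-vote rule over exact annotation spans; the function name, data layout and vote threshold are hypothetical.

    from collections import defaultdict

    def harmonise(annotation_sets, min_votes=2):
        """Merge annotations from several systems into a silver standard.

        `annotation_sets` is a list (one entry per system) of sets of
        (doc_id, start, end, semantic_group) tuples. An annotation is kept
        when at least `min_votes` systems agree on the exact span and group.
        This majority-vote rule is an illustrative assumption, not the
        procedure actually used to build the SSC.
        """
        votes = defaultdict(int)
        for system in annotation_sets:
            for annotation in system:
                votes[annotation] += 1
        return {a for a, n in votes.items() if n >= min_votes}

    # Example: three systems annotating one abstract
    systems = [
        {("pmid:1", 0, 7, "PRGE"), ("pmid:1", 20, 27, "DISO")},
        {("pmid:1", 0, 7, "PRGE")},
        {("pmid:1", 0, 7, "PRGE"), ("pmid:1", 20, 27, "DISO")},
    ]
    print(harmonise(systems, min_votes=2))
    # Both annotations survive, each supported by at least two systems.

Under such a rule, annotations produced by only one system are discarded, which keeps the merged corpus homogeneous at the cost of some recall.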

Highlights

  • Competitions in text mining have been used to measure the performance of automatic text processing solutions against a manually annotated gold standard corpus (GSC)

  • The First CALBC Challenge is similar in that the project partners (PPs) of the CALBC project provided an annotated corpus to the challenge participants (CPs), who were asked to reproduce the annotations by automatic means

  • The First CALBC Challenge differed from the before-mentioned challenges in two respects: (1) the annotated corpus was generated automatically rather than manually (Silver Standard Corpus, SSC-I), and (2) the SSC-I is significantly larger than the corpora produced for the other challenges, i.e. the annotated corpus contains 50,000 Medline abstracts for training and the test corpus to be annotated consists of 100,000 documents



Introduction

Competitions in text mining have been used to measure the performance of automatic text processing solutions against a manually annotated gold standard corpus (GSC). The First CALBC Challenge differed from the before-mentioned challenges in two respects: (1) the annotated corpus was generated automatically rather than manually (Silver Standard Corpus, SSC-I), and (2) the SSC-I is significantly larger than the corpora produced for the other challenges, i.e. the annotated corpus contains 50,000 Medline abstracts for training and the test corpus consists of 100,000 documents. This difference in size requires that all assessment is performed fully automatically, that the CPs apply annotation solutions that can cope with such a large-scale corpus, and that the assessment solution can evaluate the contributions in a short period of time. Overall, these annotations should have the characteristic that all annotation solutions show high performance against the set of annotations, for example when measuring the F-measure of the annotation solution [6].
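Since the evaluation hinges on the F-measure, the following Python sketch shows how precision, recall and F1 could be computed for a set of entity annotations. Strict exact-span matching and the tuple layout (doc_id, start, end, semantic_group) are assumptions made for illustration; the challenge's actual scoring may also credit alternative boundary matches.

    def f_measure(predicted, reference):
        """Precision, recall and F1 for entity annotations under strict
        exact-span matching. Both arguments are sets of
        (doc_id, start, end, semantic_group) tuples. Exact matching is an
        illustrative assumption; an evaluation may also accept partial or
        nested matches.
        """
        true_positives = len(predicted & reference)
        precision = true_positives / len(predicted) if predicted else 0.0
        recall = true_positives / len(reference) if reference else 0.0
        if precision + recall == 0.0:
            return precision, recall, 0.0
        f1 = 2 * precision * recall / (precision + recall)
        return precision, recall, f1

    reference = {("pmid:1", 0, 7, "PRGE"), ("pmid:1", 20, 27, "DISO")}
    predicted = {("pmid:1", 0, 7, "PRGE"), ("pmid:1", 40, 48, "CHED")}
    print(f_measure(predicted, reference))  # (0.5, 0.5, 0.5)

Because both the SSC and the submissions contain hundreds of thousands of annotations, a set-based comparison of this kind keeps the fully automatic assessment fast enough for the corpus size described above.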


