Harnessing Diversity in Crowds and Machines for Better NER Performance

Oana Inel, Lora Aroyo

doi:10.1007/978-3-319-58068-5_18

Abstract

In recent years, information extraction tools have gained great popularity and brought significant performance improvements in extracting meaning from structured or unstructured data. For example, named entity recognition (NER) tools identify types such as people, organizations or places in text. However, despite their high F1 performance, NER tools are still prone to brittleness due to their highly specialized and constrained input and training data. As a result, each tool is able to extract only a subset of the named entities (NE) mentioned in a given text. To improve NE coverage, we propose a hybrid approach, where we first aggregate the output of various NER tools and then validate and extend it through crowdsourcing. The results from our experiments show that this approach performs significantly better than the individual state-of-the-art tools (including existing tools that already integrate individual outputs). Furthermore, we show that (1) the crowd is quite effective at identifying mistakes, inconsistencies and ambiguities in currently used ground truth, and (2) crowdsourcing is a promising approach to gathering ground truth annotations for NER that capture a multitude of opinions.
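The aggregation step described in the abstract can be illustrated with a minimal sketch. It assumes each tool returns character-offset spans; the tool names, the example sentence, the gold set, and the helper functions `aggregate_entities`, `coverage`, and `needs_validation` are all hypothetical and are not taken from the paper. The sketch only shows why a union of tool outputs can cover more entities than any single tool, and which spans might be passed on for crowd validation.

```python
from collections import Counter


def aggregate_entities(tool_outputs):
    """Union the spans returned by several NER tools.

    tool_outputs maps a tool name to a set of (start, end, surface) spans.
    Returns a Counter mapping each span to the number of tools that found it.
    """
    votes = Counter()
    for spans in tool_outputs.values():
        votes.update(set(spans))  # each tool votes at most once per span
    return votes


def coverage(found, gold):
    """Fraction of gold spans that appear in `found` (i.e. recall)."""
    gold = set(gold)
    if not gold:
        return 0.0
    return len(set(found) & gold) / len(gold)


def needs_validation(votes, n_tools):
    """Spans not found by every tool: candidates for crowd validation."""
    return sorted(span for span, count in votes.items() if count < n_tools)


# Illustrative data only (assumption): two made-up tools, one sentence.
text = "Ada Lovelace met Charles Babbage in London."
tool_outputs = {
    "tool_a": {(0, 12, "Ada Lovelace"), (36, 42, "London")},
    "tool_b": {(0, 12, "Ada Lovelace"), (17, 32, "Charles Babbage")},
}
gold = {
    (0, 12, "Ada Lovelace"),
    (17, 32, "Charles Babbage"),
    (36, 42, "London"),
}

votes = aggregate_entities(tool_outputs)
for name, spans in tool_outputs.items():
    print(name, round(coverage(spans, gold), 2))
print("aggregated", round(coverage(votes, gold), 2))
print("to validate:", needs_validation(votes, len(tool_outputs)))
```

In this toy setting each tool alone covers two of the three gold entities, while the union covers all three. A real pipeline would also have to reconcile overlapping or partially matching spans and conflicting types, which this sketch does not attempt.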
