Abstract

Event detection (ED), the task of identifying event trigger words and classifying their event types, is the first and most fundamental step in extracting event knowledge from plain text. Most existing datasets have two issues that limit further development of ED: (1) Data scarcity. Existing small-scale datasets are not sufficient for training and stably benchmarking increasingly sophisticated modern neural methods. (2) Low coverage. The limited event types in existing datasets do not adequately cover general-domain events, which restricts the applications of ED models. To alleviate these problems, we present a MAssive eVENt detection dataset (MAVEN), which contains 4,480 Wikipedia documents, 118,732 event mention instances, and 168 event types. MAVEN alleviates the data scarcity problem and covers many more general event types. We reproduce recent state-of-the-art ED models and conduct a thorough evaluation on MAVEN. The experimental results show that existing ED methods do not achieve results on MAVEN as promising as those on the small datasets, which suggests that ED in the real world remains a challenging task and requires further research efforts. We also discuss further directions for general-domain ED with empirical analyses. The source code and dataset can be obtained from https://github.com/THU-KEG/MAVEN-dataset.

Highlights

  • Event detection (ED) is an important task of information extraction, which aims to identify event triggers and classify event types

  • We present the MAssive eVENt detection dataset (MAVEN), a human-annotated, massive general-domain event detection dataset constructed from English Wikipedia and FrameNet (Baker et al, 1998), which can alleviate the data scarcity and low coverage problems: (1) Our MAVEN dataset contains 111,611 different events and 118,732 event mentions, which is twenty times larger than the most widely-used ACE 2005 dataset, and 4,480 annotated documents in total

  • We show the main statistics of MAVEN and compare them with some existing widely-used ED datasets in Table 2, including the most widely-used ACE 2005 dataset (Walker et al, 2006) and a series of Rich ERE annotation datasets provided by the TAC KBP competition, which are DEFT Rich ERE English Training Annotation V2 (LDC2015E29), DEFT Rich ERE English Training Annotation R2 V2 (LDC2015E68), DEFT Rich ERE Chinese and English Parallel Annotation V2 (LDC2015E78), TAC KBP Event Nugget Data 2014-2016 (LDC2017E02) (Ellis et al, 2014, 2015, 2016) and TAC KBP 2017 (LDC2017E55) (Getman et al, 2017)
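As a rough check on the "twenty times larger" claim, the counts quoted on this page can be compared directly. The sketch below uses only the MAVEN figures above and the ACE 2005 figures given in the Introduction (599 documents, 5,349 annotated instances); the variable names are illustrative and not part of any released tooling.

```python
# Counts quoted in the text; dictionary keys are illustrative names.
maven = {"documents": 4480, "event_mentions": 118732}
ace2005 = {"documents": 599, "event_mentions": 5349}

# Ratio of MAVEN to ACE 2005 for each statistic.
mention_ratio = maven["event_mentions"] / ace2005["event_mentions"]
document_ratio = maven["documents"] / ace2005["documents"]

print(f"event mentions: {mention_ratio:.1f}x")  # about 22.2x
print(f"documents: {document_ratio:.1f}x")      # about 7.5x
```

By these figures, the factor of roughly twenty applies to annotated event mentions, while the document count grows by a smaller factor.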

Summary

Introduction

Event detection (ED) is an important task of information extraction, which aims to identify event triggers (the words or phrases evoking events in text) and classify event types. In the sentence “Bill Gates founded Microsoft in 1975”, an ED model should recognize that the word “founded” is the trigger of a Found event. As event annotation is complex and expensive, the existing datasets are mostly small-scale. The most widely-used ACE 2005 English dataset (Walker et al, 2006) contains only 599 documents and 5,349 annotated instances. Due to the inherent data imbalance problem, 20 of its 33 event types have fewer than 100 annotated instances. As recent neural methods are typically data-hungry, these small-scale datasets are not sufficient for training and stably benchmarking mod-
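To make the task's input and output concrete, here is a minimal sketch of what an ED system returns for the example sentence above. It is not the method evaluated in the paper: the `EventMention` record, the `detect_events` function, and the one-entry `TOY_TRIGGER_LEXICON` are hypothetical names introduced for illustration, and a dictionary lookup stands in for a trained model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EventMention:
    trigger: str     # surface form of the trigger word
    start: int       # index of the first trigger token
    end: int         # index one past the last trigger token
    event_type: str  # label assigned to the trigger


# Hypothetical toy lexicon; a real ED model learns this mapping from data.
TOY_TRIGGER_LEXICON = {"founded": "Found"}


def detect_events(tokens):
    """Return one EventMention per token found in the toy lexicon."""
    mentions = []
    for i, token in enumerate(tokens):
        event_type = TOY_TRIGGER_LEXICON.get(token.lower())
        if event_type is not None:
            mentions.append(EventMention(token, i, i + 1, event_type))
    return mentions


tokens = "Bill Gates founded Microsoft in 1975".split()
mentions = detect_events(tokens)
print(mentions)
```

The two subtasks named in the definition map onto the record fields: trigger identification fills `trigger`, `start`, and `end`, and type classification fills `event_type`.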
