Abstract

Recognizing temporal relations among events and time expressions has been an essential but challenging task in natural language processing. Conventional annotation, which requires judging temporal relations directly, places a heavy load on annotators. In practice, existing annotated corpora include annotations only on “salient” event pairs or on pairs within a fixed window of sentences. In this paper, we propose a new approach to obtain temporal relations from absolute time values (a.k.a. time anchors), which is suitable for texts containing rich temporal information, such as news articles. We start from time anchors for events and time expressions, and temporal relation annotations are induced automatically by computing the relative order of two time anchors. This proposal shows several advantages over current methods for temporal relation annotation: it requires less annotation effort, can induce inter-sentence relations easily, and increases the informativeness of temporal relations. We compare the empirical statistics and automatic recognition results on our data against a previous temporal relation corpus. We also show that our data contributes to a significant improvement in the downstream time anchor prediction task, demonstrating a 14.1-point increase in overall accuracy.
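
The induction step described above, ordering two mentions by comparing their time anchors, can be illustrated with a minimal sketch. It assumes each anchor is an inclusive day interval; the `Anchor` class, the function names, the label set (BEFORE, AFTER, SAME, OVERLAP), and the example dates are all hypothetical and chosen for illustration, not taken from the paper.

```python
from dataclasses import dataclass
from datetime import date
from itertools import combinations


@dataclass(frozen=True)
class Anchor:
    """A mention with its time anchor as an inclusive day interval.

    Hypothetical representation; the paper's anchor format may differ.
    """
    mention: str
    start: date
    end: date


def induce_relation(a: Anchor, b: Anchor) -> str:
    """Return an illustrative order label for mention a relative to mention b."""
    if a.end < b.start:
        return "BEFORE"   # a finishes strictly before b begins
    if b.end < a.start:
        return "AFTER"    # a begins strictly after b finishes
    if a.start == b.start and a.end == b.end:
        return "SAME"     # identical anchors
    return "OVERLAP"      # intervals share at least one day


def induce_all(anchors):
    """Induce a label for every unordered pair, regardless of sentence distance."""
    return {
        (a.mention, b.mention): induce_relation(a, b)
        for a, b in combinations(anchors, 2)
    }


# Made-up anchors for three mentions in one article.
anchors = [
    Anchor("announced", date(1998, 2, 5), date(1998, 2, 5)),
    Anchor("last week", date(1998, 1, 26), date(1998, 2, 1)),
    Anchor("talks", date(1998, 2, 1), date(1998, 2, 5)),
]
relations = induce_all(anchors)
for pair, label in relations.items():
    print(pair, label)
```

Because every pair is labeled from the anchors alone, no pairwise judgment is asked of the annotator, and mentions in different sentences are handled the same way as mentions in the same sentence.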

Highlights

  • Temporal information extraction is becoming an active research field in natural language processing (NLP) due to the rapidly growing need for NLP applications such as timeline generation and question answering (Llorens et al, 2015; Meng et al, 2017)

  • Although the classification results of TORDERs and temporal links (TLINKs) are not directly comparable, they offer some evidence of whether TORDERs can provide temporal order information

  • We propose a new approach to obtain temporal relations based on time anchors of mentions in news articles


Summary

Introduction

Temporal information extraction is becoming an active research field in natural language processing (NLP) due to the rapidly growing need for NLP applications such as timeline generation and question answering (Llorens et al, 2015; Meng et al, 2017). TimeBank (Pustejovsky et al, 2003) is the first widely used corpus with temporal information annotated in the NLP community. It contains 183 news articles that have been annotated with events, time expressions, and temporal relations between events and time expressions. The subsequent TempEval-1, 2, and 3 competitions (Verhagen et al, 2009, 2010; UzZaman et al, 2012) mostly relied on TimeBank and aimed to improve coverage by annotating relations between all events and time expressions in the same sentence. Cassidy et al (2014) proposed a compulsory mechanism to force annotators to label every pair in a given sentence window. They performed the annotation (TimeBankDense) on a subset (36 documents) of TimeBank, which achieved a denser corpus with 6.3 TLINKs per event and time expression, compared to 0.7 in the original TimeBank corpus. This raises the issue that hand-labeling all dense TLINKs is extremely time-consuming, and the unclear definition of “salient” is not improved at all.
