Temporal Information Extraction with Cross-Language Projected Data

Przemysław Jarzębowski, Adam Przepiórkowski

doi:10.1007/978-3-642-33983-7_20

Abstract

This paper presents a method for extracting temporal information from raw texts in Polish. The extracted information consists of the text fragments that describe events, the time expressions, and the temporal relations between them. Together with temporal reasoning, it can be used in applications such as question answering, text summarization, and information extraction. First, a bilingual corpus was used to project temporal annotations from English to Polish. This data was further enhanced by manual correction and then used to induce classifiers based on Conditional Random Fields (CRF) and a Support Vector Machine (SVM). To evaluate this task, we propose a cross-language method that compares the system's results with results obtained for other languages. It shows that the temporal relations classifier presented here outperforms state-of-the-art systems for English in terms of the macro-averaged F1-measure, which is well suited to this multiclass classification task.
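As an illustration of the evaluation measure named in the abstract, the following is a minimal sketch of a macro-averaged F1 computation: F1 is computed separately for each class and the per-class scores are averaged without weighting, so rare classes count as much as frequent ones. The function name `macro_f1` and the relation labels in the usage note are assumptions made for this example and are not taken from the paper.

```python
from collections import Counter


def macro_f1(gold, predicted):
    """Unweighted mean of per-class F1 over all labels seen in gold or predicted."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, predicted):
        if g == p:
            tp[g] += 1          # correct prediction for class g
        else:
            fp[p] += 1          # p was predicted but is wrong
            fn[g] += 1          # g was the true class but was missed
    labels = set(gold) | set(predicted)
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores) if scores else 0.0
```

For example, calling `macro_f1(["BEFORE", "BEFORE", "AFTER", "OVERLAP"], ["BEFORE", "AFTER", "AFTER", "OVERLAP"])` averages the three per-class scores 2/3, 2/3, and 1 (the labels are hypothetical). Because every class contributes equally, a classifier cannot reach a high score by predicting only the most frequent relation.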

Full Text