Abstract

Many text documents are spatiotemporal in nature; that is, their contents can be mapped to a specific time period or location. For example, a news article about the French Revolution can be mapped to the year 1789 as its time and France as its place. Identifying the time period and location associated with a document is useful for downstream applications such as document reasoning and spatiotemporal information retrieval. In this paper, temporal entropy with pointwise mutual information (PMI) is proposed to estimate the temporal focus of a document. PMI measures the association of words with time expressions. A word's temporal entropy is used as a weight on its association with a time point, and the single time point with the highest overall score is chosen as the focus time of the document. The method is generic in that it can also be applied to spatial focus estimation: in spatial entropy with PMI, PMI measures the association between words and place entities. The proposed methods for spatiotemporal focus estimation are evaluated on diverse datasets of text documents. The experimental evaluation confirms the superiority of the proposed temporal and spatial focus estimation methods.

Highlights

  • Mapping the contents of a document to a specific time period or location is important for document understanding

  • A key approach to document focus time estimation [1] relies on the association of words with time expressions in documents: the time expression that is distinctively associated with a document's terms is selected as the target

  • We propose spatial entropy with pointwise mutual information (PMI) in order to estimate the focus place of documents

Summary

Introduction

Mapping the contents of a document to a specific time period or location is important for document understanding. Location and time expressions can be used as clues to predict the spatiotemporal focus of unstructured text. A key approach to document focus time estimation [1] relies on the association of words with time expressions in documents: the time expression that is distinctively associated with a document's terms is selected as the target. Each word's contribution to the overall document-time association score is weighted by how strongly the word is associated with time points. The scheme depends on the extraction of time expressions. Because place entities can be extracted in a similar way, the proposed approach is also applicable to focus place estimation of documents.
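The scoring scheme described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the excerpt does not give the exact formulas, so document-level co-occurrence counts, a zero score for unseen word-time pairs, and a weight of one minus the normalized entropy are all assumptions, and the function names (`train`, `pmi`, `temporal_weight`, `focus_time`) and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def train(docs):
    """Count document-level occurrences from (words, time_points) pairs."""
    word_df, time_df, joint_df = Counter(), Counter(), Counter()
    for words, times in docs:
        ws, ts = set(words), set(times)
        word_df.update(ws)
        time_df.update(ts)
        joint_df.update((w, t) for w in ws for t in ts)
    return len(docs), word_df, time_df, joint_df


def pmi(word, time, model):
    """PMI(word, time) = log( P(word, time) / (P(word) * P(time)) )."""
    n, word_df, time_df, joint_df = model
    joint = joint_df[(word, time)]
    if joint == 0:
        return 0.0  # assumption: unseen pairs contribute nothing
    return math.log(joint * n / (word_df[word] * time_df[time]))


def temporal_weight(word, model):
    """Assumed weight: 1 - normalized entropy of P(time | word).

    Close to 1 for words tied to one time point, close to 0 for words
    spread evenly over all time points.
    """
    _, _, time_df, joint_df = model
    counts = [joint_df[(word, t)] for t in time_df if joint_df[(word, t)] > 0]
    total = sum(counts)
    if total == 0 or len(time_df) < 2:
        return 0.0
    entropy = -sum(c / total * math.log(c / total) for c in counts)
    return 1.0 - entropy / math.log(len(time_df))


def focus_time(words, model):
    """Return the time point with the highest entropy-weighted PMI score."""
    _, _, time_df, _ = model
    scores = {
        t: sum(temporal_weight(w, model) * pmi(w, t, model) for w in set(words))
        for t in time_df
    }
    return max(scores, key=scores.get)


# Hypothetical toy corpus: each document is (words, extracted time points).
docs = [
    (["revolution", "bastille", "king"], [1789]),
    (["revolution", "bastille", "paris"], [1789]),
    (["revolution", "war", "trench"], [1914]),
    (["war", "armistice"], [1918]),
]
model = train(docs)
print(focus_time(["bastille", "revolution"], model))
```

In this toy corpus, "bastille" co-occurs only with 1789 and so receives full weight, while "revolution" is spread over two time points and is down-weighted. Passing extracted place entities instead of time points gives the analogous focus place estimate.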
