Abstract

Geotagging is the process of recognizing place and facility names in a document and assigning a latitude/longitude pair to each of them. The latter step relies on an external geographic database that contains pairs of place/facility names and latitude/longitude values. However, when a historical document uses former place/facility names, latitude and longitude values cannot be assigned to them, even though their current names are listed in the database. Furthermore, when the geographic database contains multiple entries with the same place/facility name, the correct one must be chosen. In this paper, we propose a method for constructing a database of current and former place/facility name pairs. We applied a machine learning-based information extraction method to text corpora and automatically extracted current and former place/facility name pairs. We also propose a method that disambiguates identical place/facility names. We conducted experiments to confirm the effectiveness of our method.
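The lookup step described above can be illustrated with a minimal sketch. It is not the method proposed in the paper: the gazetteer contents, the former-to-current name table, the function names, and the nearest-to-context disambiguation rule are all assumptions made for illustration, and the coordinates are approximate.

```python
import math

# Hypothetical gazetteer: name -> candidate (lat, lon) entries.
# Coordinates are approximate and for illustration only.
GAZETTEER = {
    "Tokyo": [(35.68, 139.69)],
    "Fuchu": [(35.67, 139.48), (34.57, 133.24)],  # same name, two places
}

# Hypothetical table of former -> current name pairs, standing in for
# the kind of database the abstract proposes to construct.
FORMER_TO_CURRENT = {"Edo": "Tokyo"}


def haversine_km(a, b):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def geotag(name, context=None):
    """Return a (lat, lon) pair for a place name, or None if it is unknown.

    A former name is first mapped to its current name. If several
    gazetteer entries share the name, the one closest to `context`
    (an assumed reference point, e.g. another place already resolved
    in the same document) is chosen. This rule is an assumption here,
    not the disambiguation method of the paper.
    """
    current = FORMER_TO_CURRENT.get(name, name)
    candidates = GAZETTEER.get(current, [])
    if not candidates:
        return None
    if len(candidates) == 1 or context is None:
        return candidates[0]
    return min(candidates, key=lambda c: haversine_km(c, context))
```

With these illustrative tables, `geotag("Edo")` resolves through the current name "Tokyo", and `geotag("Fuchu", context=...)` selects whichever of the two candidates lies nearer to the supplied reference point.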
