Abstract

The Web is a large source of geographic information. Many Web documents contain one or more spatial references, such as place names, addresses, zip codes, or phone numbers. These spatial references are usually found in a semistructured fashion, which allows humans to identify them and assign a geographic meaning to documents. In this paper, we discuss the important role that gazetteers, which are spatial catalogues of place names, can play in automating this process, and we introduce the Locus gazetteer. Locus has been designed to hold not only place names for entities such as cities and rivers, but also intra-urban place names, such as street names, urban landmarks, and postal addresses, along with their spatial relationships, through an ontology of places. We demonstrate that ontologically enhanced gazetteers, such as Locus, are very useful for discovering the geographic context present on Web pages, and are also used in many other applications, such as address geocoding for geographic information systems. To accomplish these tasks efficiently, the gazetteer must have a large database of spatial references; however, such a database is hard to obtain in emerging countries such as Brazil, where the available official geographic databases are limited and not kept up to date. To tackle this problem, we describe a semi-automatic method used to populate the Locus gazetteer with geographic content extracted directly from the Web. To evaluate our work, we conducted an experiment that tests the quality and comprehensiveness of the Locus gazetteer data.

