Abstract
Geographic information retrieval (GIR) is an active research area that involves managing uncertainty and imprecision and modeling user preferences and context. Indexing the geographic content of documents means dealing with the ambiguity, synonymy and homonymy of geographic names in text. On the other hand, evaluating queries that specify both content-based and spatial conditions on documents’ contents requires representing the vagueness and context dependency of spatial conditions as well as the user's personal preferences. The spatial condition can be specified linguistically in the query through vague terms such as “close to the North East of Milan”, whose semantics depends on the user's context and perception of distance. Further, users may want to express queries in which the content condition and the spatial condition have distinct preferences and are combined with distinct semantics. In this paper, we propose a geographic information retrieval model, and a system implementing it, that represents both the uncertainty in indexing the geographic content of documents and the user's context and preferences in evaluating flexible spatial queries. The system extracts the geographic content from documents’ text by applying heuristic knowledge coded as bipolar rules, which evaluate positive and negative hints for the recognition of geographic names in text. It then represents the geographic content of documents by fuzzy footprints, i.e., distinct locations on the earth associated with the text, each with its own degree of significance. Finally, the system allows evaluating two types of queries that flexibly combine the content-based condition with the spatial condition. The spatial condition is interpreted as the soft constraint “close” on the user's perceived distance between the document's footprint and the query's footprint.
For each retrieved document, two relevance scores are computed, one for each query condition, and are flexibly combined to generate an overall ranked list of documents. The user can choose the semantics of the combination: either an asymmetric “and possibly” aggregation between the mandatory content condition and the optional spatial condition, or a compensative “average” aggregation, defined as a linear combination of the two conditions. Further, a relative preference between the conditions can be specified to achieve personalization and effectiveness. A prototype geographic information retrieval system named Geo-Finder, based on this model, is described, and its evaluations are discussed.
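The two aggregation semantics described above can be illustrated with a minimal Python sketch. It is not the Geo-Finder implementation: the function names, the trapezoidal shape of the “close” constraint, its distance thresholds, and in particular the formula chosen for “and possibly” are all assumptions made for illustration. The abstract states only that “average” is a linear combination and that “and possibly” treats the content condition as mandatory and the spatial condition as optional.

```python
def close(distance_km, full_km=10.0, zero_km=50.0):
    """Soft constraint "close" on a perceived distance (assumed shape).

    Returns 1.0 up to full_km, decreases linearly to 0.0 at zero_km.
    Both thresholds are hypothetical and stand in for the user's context.
    """
    if distance_km <= full_km:
        return 1.0
    if distance_km >= zero_km:
        return 0.0
    return (zero_km - distance_km) / (zero_km - full_km)


def average(content, spatial, w_content=0.5):
    """Compensative aggregation: linear combination of the two scores.

    w_content is the relative preference given to the content condition.
    """
    return w_content * content + (1.0 - w_content) * spatial


def and_possibly(content, spatial, w_content=0.5):
    """Asymmetric aggregation (one possible formalization, an assumption).

    The content score is mandatory: if it is 0 the result is 0.
    The spatial score is optional: it can only raise the result from
    w_content * content up to content.
    """
    return content * (w_content + (1.0 - w_content) * spatial)


def rank(documents, aggregate, w_content=0.5):
    """Sort (doc_id, content_score, distance_km) tuples by overall score."""
    scored = [
        (doc_id, aggregate(content, close(dist), w_content))
        for doc_id, content, dist in documents
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Under these assumptions, a document with no content match but a perfect spatial match still receives a non-zero score from `average`, whereas `and_possibly` scores it 0, which is the asymmetry between the mandatory and the optional condition.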