An Approach to Measuring Semantic Relatedness of Geographic Terminologies Using a Thesaurus and Lexical Database Sources

Zugang Chen, Jia Song, Yaping Yang

doi:10.3390/ijgi7030098

Abstract

In geographic information science, semantic relatedness is important for Geographic Information Retrieval (GIR), Linked Geospatial Data, geoparsing, and geo-semantics. However, computing the semantic similarity/relatedness of geographic terminology remains an urgent, unresolved problem. The thesaurus is a ubiquitous and sophisticated knowledge representation tool that exists in many domains. In this article, we combined a generic lexical database (WordNet or HowNet) with the Thesaurus for Geographic Science and proposed a thesaurus–lexical relatedness measure (TLRM) to compute the semantic relatedness of geographic terminology. This measure quantified the relationships between terminologies, interlinked the discrete term trees through the generic lexical database, and thereby allowed the semantic relatedness of any two terminologies in the thesaurus to be computed. The TLRM was evaluated on a new relatedness baseline that we built, the Geo-Terminology Relatedness Dataset (GTRD), and obtained relatively high cognitive plausibility. Finally, we applied the TLRM to a geospatial data sharing portal to support data retrieval. Results for the 30 most frequently used queries of the portal demonstrated that using the TLRM could improve the recall of geospatial data retrieval in most situations and rank the retrieval results by the matching scores between the user's query and the geospatial dataset.

Highlights

Semantic similarity relies on similar attributes and relations between terms, whilst semantic relatedness is based on the aggregate of interconnections between terms [1].
We review only the literature on semantic similarity/relatedness measures that are based on thesaurus knowledge sources or set in the context of GIScience.
Given that terminologies deal with professional knowledge, we should conduct the survey with geography experts.

Summary

Introduction

Semantic similarity relies on similar attributes and relations between terms, whilst semantic relatedness is based on the aggregate of interconnections between terms [1]. Semantic similarity is a subset of semantic relatedness: all similar terms are related, but related terms are not necessarily similar [2]. For example, “river” and “stream” are semantically similar, while “river” and “boat” are dissimilar but semantically related [3]. Semantic similarity/relatedness measures are used to solve problems in a broad range of applications and domains, including (i) natural language processing, (ii) knowledge engineering, the Semantic Web, and Linked Data [5], (iii) information retrieval, and (iv) artificial intelligence [6]. To present our research accurately, we restrict this study to semantic relatedness.
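The distinction between similarity and relatedness drawn above can be made concrete with a toy sketch. The code below is not the TLRM and is not taken from the paper: the term graph, the relation names ("is_a", "used_on"), and the 1 / (1 + path length) score are all illustrative assumptions. It only shows how restricting a path-based score to hierarchical links yields a similarity-style measure, while allowing every link type yields a relatedness-style measure.

```python
from collections import deque

# Hypothetical toy edges (term, term, relation type); NOT from the paper.
EDGES = [
    ("river", "watercourse", "is_a"),
    ("stream", "watercourse", "is_a"),
    ("boat", "vehicle", "is_a"),
    ("boat", "river", "used_on"),
]


def shortest_path_length(source, target, allowed_relations):
    """Breadth-first search over undirected edges whose type is allowed.

    Returns the number of edges on the shortest path, or None if the
    two terms are not connected through the allowed relation types.
    """
    neighbours = {}
    for a, b, relation in EDGES:
        if relation in allowed_relations:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)

    queue = deque([(source, 0)])
    seen = {source}
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nxt in neighbours.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def score(source, target, allowed_relations):
    """Assumed toy score: 1 / (1 + path length), or 0.0 if unconnected."""
    dist = shortest_path_length(source, target, allowed_relations)
    return 0.0 if dist is None else 1.0 / (1.0 + dist)


def similarity(a, b):
    # Similarity-style score: hierarchical (is_a) links only.
    return score(a, b, {"is_a"})


def relatedness(a, b):
    # Relatedness-style score: every kind of link counts.
    return score(a, b, {"is_a", "used_on"})


if __name__ == "__main__":
    for pair in [("river", "stream"), ("river", "boat")]:
        print(pair, similarity(*pair), relatedness(*pair))
```

In this toy graph, "river" and "stream" share a hierarchical parent, so both scores are positive, whereas "river" and "boat" are connected only through a non-hierarchical link, so the similarity-style score is zero while the relatedness-style score is not.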

Results

Discussion

Conclusion