Abstract

Measuring the semantic similarity between words is important for natural language processing tasks. Traditional models of semantic similarity perform well in most cases, but when words involve geographical context, the spatial semantics carried by implied spatial information are rarely preserved. Geographic information retrieval (GIR) methods have focused on this issue; however, they sometimes fail to solve the problem because the spatial and textual similarities of words are considered and calculated separately. In this paper, from the perspective of spatial context, we treat the two parts as a single whole, spatial context semantics, and we propose a method that measures spatial semantic similarity using a sliding geospatial context window for geo-tagged words. The proposed method was first validated with a set of simulated data and then applied to a real-world dataset from Flickr. As a result, a spatial semantic similarity model at different scales is presented. We believe this model is a necessary supplement to the traditional textual-language semantic analyses of words obtained by word-embedding technologies. This study has the potential to improve the quality of recommendation systems by considering relevant spatial context semantics, and it benefits linguistic semantic research by emphasising the spatial cognition among words.

Highlights

  • With the recent advancements in artificial intelligence (AI) and computational linguistics, natural language processing (NLP) has attracted considerable attention, and the requirements for representing human-computer interactions and senses have increased [1]

  • When dealing with unstructured content that carries deep background meaning, as in geo-related information retrieval (IR) tasks, similarity measurements based on plain text perform poorly [13,14,15,16,17]. Examples include the similarities between beer-smile, club-beer and other pairs of words whose relation is spatial and implied; such similarities help optimize and expand the query results of geographic recommendation and geographic search systems

  • A geospatial context window-based method is proposed in this paper to measure the spatial semantic similarity (s-SIM) of words


Summary

Introduction

With the recent advancements in artificial intelligence (AI) and computational linguistics, natural language processing (NLP) has attracted considerable attention, and the requirements for representing human-computer interactions and senses have increased [1]. The semantic similarity between two words can be measured by calculating the distance between their vectors. These vector models are based on statistical inferences of large corpora under the assumption that words with similar distributional properties in the same context have similar semantic meanings [11]. When semantic similarities are measured by these models from the co-occurrences of words in the corpora, a ‘true’ understanding of the words is unobtainable [7]; that is, purely text-based approaches fail when processing information that requires complex reasoning [12]. These models are sufficient to handle most common scenarios for a general corpus.
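
As a minimal sketch of measuring similarity through the distance between word vectors, the snippet below computes cosine similarity, one common choice of vector comparison. The excerpt does not say which distance the cited models use, so cosine similarity is an assumption here, and the function name `cosine_similarity` and the three-dimensional `toy_vectors` are hypothetical values invented for illustration, not data or results from the paper.

```python
import math


def cosine_similarity(u, v):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: invented numbers, not trained vectors).
toy_vectors = {
    "beer": [0.9, 0.1, 0.3],
    "club": [0.8, 0.2, 0.4],
    "smile": [0.1, 0.9, 0.2],
}

sim_beer_club = cosine_similarity(toy_vectors["beer"], toy_vectors["club"])
sim_beer_smile = cosine_similarity(toy_vectors["beer"], toy_vectors["smile"])

print(f"beer-club:  {sim_beer_club:.3f}")
print(f"beer-smile: {sim_beer_smile:.3f}")
```

With these invented vectors, beer-club scores higher than beer-smile only because the numbers were chosen that way. A measure of this kind reflects textual co-occurrence alone and has no notion of where the words were used.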

