Abstract

Existing spatial keyword query processing models mainly consider the spatial proximity and text relevancy between spatial objects and the spatial keyword query, which usually makes the top-k answer objects similar to each other. However, users hope to obtain top-k results that are typical and semantically related to their query intention. This paper proposes a top-k spatial keyword typicality and semantic querying approach that can expeditiously provide top-k typical and semantically related objects for a given query. The approach consists of two processing steps. During the offline step, we first analyze the location-semantic relationships between spatial objects by considering both the location similarity and the document semantic relevancy between them. To measure the semantic similarity between the documents associated with the spatial objects, we propose two methods: the keyword coupling relationship-based document similarity measure and the Word2Vec-CNN-based document similarity measure. Then, a Gaussian probabilistic density-based estimation method is leveraged to find a few representative objects in the dataset, and for each representative object an order (permutation) of the remaining objects is generated. The objects in each permutation are ranked in descending order of their location-semantic relationship to the representative object. When a spatial keyword query arrives, the online processing step first computes the spatial proximity and semantic relevancy between the query and each representative object; a small number of the orders generated in the offline step are then selected and used at query time to facilitate the selection of top-k typical and semantically related objects by using the threshold algorithm (TA).
Results of a preliminary user study demonstrate that our location-semantic relationship measuring method can accurately capture the location similarity and semantic relevancy between spatial objects. The efficiency of the typicality analysis and of the TA-based top-k selection algorithm is also demonstrated.
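
The online step relies on the threshold algorithm (TA) to combine several ranked orders without scanning them in full. The following is a minimal sketch of a generic TA, not the paper's implementation. It assumes that each order is a list of (object id, score) pairs sorted by descending score, that the aggregate score is a weighted sum with non-negative weights (for example, the query's relevancy to each representative object), and that an object missing from a list scores 0 there. The function name `threshold_topk` and its parameters are hypothetical.

```python
import heapq
from typing import Dict, List, Tuple

Scored = Tuple[str, float]


def threshold_topk(lists: List[List[Scored]],
                   weights: List[float],
                   k: int) -> List[Scored]:
    """Generic threshold algorithm over descending-sorted score lists.

    Assumption: the aggregate is a weighted sum with non-negative
    weights, so it is monotone in every per-list score.
    """
    # Random-access tables: object id -> score in each list.
    lookup: List[Dict[str, float]] = [dict(lst) for lst in lists]

    def aggregate(obj: str) -> float:
        return sum(w * table.get(obj, 0.0)
                   for w, table in zip(weights, lookup))

    seen: Dict[str, float] = {}
    top: List[Tuple[float, str]] = []  # min-heap holding the best k so far
    depth = max(len(lst) for lst in lists)

    for pos in range(depth):
        last_scores = []
        for lst in lists:
            if pos >= len(lst):
                # Exhausted list: unseen objects score 0 here.
                last_scores.append(0.0)
                continue
            obj, score = lst[pos]
            last_scores.append(score)
            if obj not in seen:
                seen[obj] = aggregate(obj)
                heapq.heappush(top, (seen[obj], obj))
                if len(top) > k:
                    heapq.heappop(top)
        # Upper bound on the aggregate score of any object not yet seen.
        threshold = sum(w * s for w, s in zip(weights, last_scores))
        if len(top) == k and top[0][0] >= threshold:
            break  # no unseen object can enter the top k

    return sorted(((obj, s) for s, obj in top), key=lambda t: -t[1])


if __name__ == "__main__":
    order_1 = [("a", 0.9), ("b", 0.8), ("c", 0.1)]
    order_2 = [("b", 0.7), ("c", 0.6), ("a", 0.2)]
    print(threshold_topk([order_1, order_2], [0.5, 0.5], k=2))
```

The early exit is what makes precomputed orders useful at query time: once the k-th best aggregate score reaches the threshold, the remaining entries of every list can be skipped.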

Highlights

  • With the rapid development of GPS and the widespread use of mobile internet, more and more spatial objects are becoming available on the Web that represent Points of Interest (POIs) such as restaurants, hotels, cafes, and tourist attractions

  • To measure the semantic similarity between the documents associated with the spatial objects, we propose two methods: the keyword coupling relationship-based document similarity measure and the Word2Vec-convolutional neural network (CNN)-based document similarity measure

  • Step 1 (Find Representative Objects): Based on the location-semantic relationships between different pairs of spatial objects, we provide a method, which is inspired by the typicality estimation algorithm proposed in [14], to find the representative objects
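
As a rough illustration of how representative objects might be chosen, the sketch below scores each object with a Gaussian kernel density estimate over its dissimilarities to all objects and keeps the highest-scoring ones. This is a generic simple-typicality sketch and not the method of [14] or of the paper. It assumes a precomputed matrix `dist` of non-negative pairwise dissimilarities (for instance, one minus a location-semantic similarity) and a bandwidth `h`; both names and the default bandwidth are hypothetical.

```python
import math
from typing import List


def gaussian_kernel(d: float, h: float) -> float:
    """Gaussian kernel with bandwidth h evaluated at distance d."""
    return math.exp(-(d * d) / (2.0 * h * h)) / (h * math.sqrt(2.0 * math.pi))


def typicality_scores(dist: List[List[float]], h: float = 0.5) -> List[float]:
    """Average kernel value between each object and every object in the set.

    A high score means the object lies in a dense region, so it is
    "typical" of many other objects.
    """
    n = len(dist)
    return [sum(gaussian_kernel(dist[i][j], h) for j in range(n)) / n
            for i in range(n)]


def pick_representatives(dist: List[List[float]], m: int,
                         h: float = 0.5) -> List[int]:
    """Indices of the m objects with the highest typicality score."""
    scores = typicality_scores(dist, h)
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:m]


if __name__ == "__main__":
    # Toy one-dimensional data: three nearby objects and one outlier.
    points = [0.0, 0.1, 0.2, 5.0]
    dist = [[abs(a - b) for b in points] for a in points]
    print(pick_representatives(dist, m=1))
```

Once the representatives are fixed, the remaining objects can be sorted by their relationship to each representative to form the per-representative orders that the online step consumes.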


Summary

Introduction

With the rapid development of GPS and the widespread use of mobile internet, more and more spatial objects (usually containing geo-textual information) are becoming available on the Web that represent Points of Interest (POIs) such as restaurants, hotels, cafes, and tourist attractions. These POIs mainly consist of two types of information: the geo-location (in the form of longitude and latitude) and the textual document (such as names, amenities, and special features) [5], [8], [27]. A spatial database usually contains a large amount of data, so the too-many-answers problem, also referred to as "information overload", often occurs when a user issues a non-selective spatial keyword query [6], [20], [26], [35].

