Abstract

The tremendous advance in information technology has promoted the rapid development of location-based services (LBSs), which play an indispensable role in people’s daily lives. A traditional LBS is based on the Point-Of-Interest (POI), which is an isolated location point. By contrast, a growing number of demands concentrate on exploring the Region-Of-Interest (ROI), i.e., a geographic region that contains many POIs and expresses rich environmental information. The intention behind ROI exploration is to find geographic regions that match the user’s requirements, contain spatial objects such as POIs, and have certain environmental characteristics. To achieve effective ROI exploration, we propose an ROI top-k keyword query method that considers the environmental information of the regions. Specifically, the Word2Vec model is introduced to obtain distributed representations of POIs and capture their environmental semantics, which are then leveraged to describe the environmental characteristics of each candidate ROI. Given a keyword query, different query patterns are designed to measure the similarity between the query keyword and the candidate ROIs and to find the k candidate ROIs most relevant to the query. In the verification step, an evaluation criterion is developed to test the effectiveness of the distributed representations of POIs. Finally, after generating high-quality POI vectors, we validated the performance of the proposed ROI top-k query on a large-scale real-life dataset, and the experimental results demonstrated the effectiveness of our proposals.

Highlights

  • Recent years have witnessed the rapid development of Internet technologies and sensor devices, which, in turn, has resulted in the explosive growth of geo-related information

  • With the grids viewed as the candidate ROIs, the ROI vectors were obtained from the vectors of the POIs inside each grid, so that they reflect the type characteristics and spatial distribution of each candidate ROI

  • Considering that some infrequent POIs tend to have a negative impact on the function and type of the ROI, term frequency-inverse document frequency (TF-IDF), a common method in information retrieval that can evaluate the importance of words in a corpus, was utilized to adjust the weights of POI types in each ROI
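As an illustration of the TF-IDF weighting mentioned in the last highlight, the sketch below treats each candidate ROI as a "document" and each POI type as a "term". It uses the textbook formula (relative frequency times log of inverse document frequency); the exact variant used by the authors is not given in this excerpt, and the function name `tfidf_weights` and the sample data are hypothetical.

```python
import math
from collections import Counter


def tfidf_weights(roi_poi_types):
    """Return {roi_id: {poi_type: tf-idf weight}}.

    roi_poi_types maps a ROI id to the list of POI type labels inside it
    (an assumed input layout, not taken from the paper).
    """
    n_rois = len(roi_poi_types)

    # Document frequency: in how many ROIs does each POI type occur?
    df = Counter()
    for types in roi_poi_types.values():
        df.update(set(types))

    weights = {}
    for roi, types in roi_poi_types.items():
        counts = Counter(types)
        total = sum(counts.values())
        # tf = share of the ROI's POIs with this type; idf = log(N / df).
        weights[roi] = {
            t: (c / total) * math.log(n_rois / df[t])
            for t, c in counts.items()
        }
    return weights


# Made-up example data for three grid cells.
sample = {
    "g1": ["restaurant", "restaurant", "bank"],
    "g2": ["restaurant", "park"],
    "g3": ["park", "park", "school"],
}
weights = tfidf_weights(sample)
```

With this unsmoothed formula, a POI type that occurs in every ROI receives weight zero; a smoothed IDF would avoid that if it is undesirable.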


Summary

Introduction

Recent years have witnessed the rapid development of Internet technologies and sensor devices, which, in turn, has resulted in the explosive growth of geo-related information. The existing ROI exploration methods are mainly based on the statistical or density information of the query elements [6,7,8,9], such as POIs with certain keywords [10,11], while neglecting the influence of the internal and environmental characteristics of the region. To query the relevant ROIs efficiently, we divided the whole research region into grids so that each grid contained a certain number of POIs. With the grids viewed as the candidate ROIs, the ROI vectors were obtained from the vectors of the POIs inside each grid, so that they reflect the type characteristics and spatial distribution of each candidate ROI.
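The pipeline described above (grid division, ROI vectors built from POI vectors, ranking by similarity to a query) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: ROI vectors are taken as the plain mean of their POI type vectors (the paper additionally applies TF-IDF and a Gaussian kernel), similarity is cosine, the grid is uniform, and all function names and sample values are hypothetical.

```python
import math
from collections import defaultdict


def grid_cell(lon, lat, origin, cell_size):
    """Map a coordinate to the (column, row) index of its grid cell."""
    col = int((lon - origin[0]) // cell_size)
    row = int((lat - origin[1]) // cell_size)
    return (col, row)


def roi_vectors(pois, poi_vecs, origin, cell_size):
    """Average the type vectors of the POIs that fall in each grid cell.

    pois: iterable of (lon, lat, poi_type); poi_vecs: {poi_type: vector}.
    Plain averaging is an assumption made for this sketch.
    """
    members = defaultdict(list)
    for lon, lat, poi_type in pois:
        cell = grid_cell(lon, lat, origin, cell_size)
        members[cell].append(poi_vecs[poi_type])
    return {
        cell: [sum(dim) / len(vecs) for dim in zip(*vecs)]
        for cell, vecs in members.items()
    }


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(query_vec, rois, k):
    """Return the k grid cells whose vectors are most similar to the query."""
    ranked = sorted(rois, key=lambda c: cosine(query_vec, rois[c]), reverse=True)
    return ranked[:k]


# Toy 2-dimensional "embeddings"; real ones would come from Word2Vec training.
poi_vecs = {"restaurant": [1.0, 0.0], "park": [0.0, 1.0]}
pois = [
    (0.1, 0.1, "restaurant"),
    (0.2, 0.3, "restaurant"),
    (1.5, 0.2, "park"),
    (1.2, 0.4, "restaurant"),
    (1.7, 0.8, "park"),
]
rois = roi_vectors(pois, poi_vecs, origin=(0.0, 0.0), cell_size=1.0)
best = top_k(poi_vecs["park"], rois, k=1)
```

In this toy data the query vector for "park" ranks the cell that contains parks ahead of the restaurant-only cell, which is the behaviour the keyword query is meant to capture.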

Related Works
Problem Statement
POI Embedding
Corpus Construction
Training POI Vectors by the Skip-Gram Model
Correlation Analysis of the POI Vectors
Candidate ROI Vector Generation
Grid Division
TF-IDF Method
Gaussian Kernel
Multi-Keyword Query Mode
Experiment and Results
Training POI Vectors and Parameter Selection
ROI Keyword Query Research
Settings
Dataset
Query Examples
Compared Methods
Evaluation Metric
Performance Comparison
Tuning the Size of the Grids
Time and Space Consumption
