Abstract

Recent advances in knowledge graphs show great promise to link various data together to provide a semantic network. Place is an important part in the big picture of the knowledge graph since it serves as a powerful glue to link any data to its georeference. A key technical challenge in constructing knowledge graphs with location nodes as geographical references is the matching of place entities. Traditional methods typically rely on rule-based matching or machine-learning techniques to determine if two place names refer to the same location. However, these approaches are often limited in the feature selection of places for matching criteria, resulting in imbalanced consideration of spatial and semantic features. Deep feature-based methods such as deep learning methods show great promise for improved place data conflation. This paper introduces a Semantic-Spatial Aware Representation Learning Model (SSARLM) for Place Matching. SSARLM liberates the tedious manual feature extraction step inherent in traditional methods, enabling an end-to-end place entity matching pipeline. Furthermore, we introduce an embedding fusion module designed for the unified encoding of semantic and spatial information. In the experiment, we evaluate the approach to named places from Guangzhou and Shanghai cities in GeoNames, OpenStreetMap (OSM), and Baidu Map. The SSARLM is compared with several classical and commonly used binary classification machine learning models, and the state-of-the-art large language model, GPT-4. The results demonstrate the benefit of pre-trained models in data conflation of named places.

Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call