Abstract

Geocoding converts unstructured geographic text into structured spatial data and is crucial in fields such as urban planning, social media spatial analysis, and emergency response systems. Existing approaches predominantly model geocoding as a classification task over geographic grid cells, but the output space grows explosively as the grid granularity becomes finer. Furthermore, these methods generally overlook the inherent hierarchical structure of both geographical texts and grids. In this paper, we propose a hierarchy-aware geocoding model based on cross-attention within the Seq2Seq framework. Using S2 geometry, we recast geocoding as a generation task in which the model predicts the S2 token (the label of an S2 grid cell) character by character. A cross-attention mechanism in the decoder lets the model dynamically attend, at each step, to the address context at the hierarchical level most relevant to the character currently being predicted. Results show that the proposed model significantly outperforms previous approaches across multiple metrics, with median and mean distance errors of 41.46 m and 93.98 m, respectively. Our method also achieves superior results in regions with sparse data distribution, reducing the median and mean distance errors by 16.27 m and 7.52 m, respectively, which suggests that the model effectively mitigates the issue of insufficient learning in such regions.
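
To make the contrast between grid classification and character-by-character label generation concrete, the following minimal sketch uses a simplified latitude/longitude quadtree as a stand-in for S2. It is an illustration under stated assumptions, not the paper's implementation: real S2 cells are built from a cube-face projection and a Hilbert curve, and S2 tokens are hexadecimal strings, whereas the hypothetical `quadtree_label` below emits one base-4 digit per level. The names `quadtree_label` and `flat_class_count` and the sample coordinates are ours.

```python
def quadtree_label(lat, lng, level):
    """Return a base-4 label for the cell containing (lat, lng).

    Simplified stand-in for an S2 token (assumption): each digit
    selects one of four child cells, so each extra character
    refines the cell by one hierarchy level.
    """
    south, north = -90.0, 90.0
    west, east = -180.0, 180.0
    digits = []
    for _ in range(level):
        mid_lat = (south + north) / 2
        mid_lng = (west + east) / 2
        quadrant = 0
        if lat >= mid_lat:      # northern half
            quadrant += 2
            south = mid_lat
        else:                   # southern half
            north = mid_lat
        if lng >= mid_lng:      # eastern half
            quadrant += 1
            west = mid_lng
        else:                   # western half
            east = mid_lng
        digits.append(str(quadrant))
    return "".join(digits)


def flat_class_count(level):
    """Number of classes a flat classifier needs at this level."""
    return 4 ** level


# Hypothetical sample point; any coordinates would do.
coarse = quadtree_label(39.9042, 116.4074, 6)
fine = quadtree_label(39.9042, 116.4074, 12)

# The coarse label is a prefix of the fine one: the label string
# itself encodes the grid hierarchy.
print(coarse, fine, fine.startswith(coarse))

# A flat classifier at level 12 needs 4**12 = 16777216 classes,
# while a generator emits 12 characters from a 4-symbol vocabulary.
print(flat_class_count(12))
```

In this toy setting, a finer grid only lengthens the generated label by one character per level, while the per-step vocabulary stays fixed. The prefix property is what gives a decoder a coarse-to-fine structure to exploit.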
