High‐performance implementation of a two‐bit geohash coding technique for nearest neighbor search

Varalakshmi M, Amit P. Kesarkar, Daphne Lopez

doi:10.1002/cpe.6029

Abstract

Geohash coding algorithms open significant opportunities for a variety of spatial applications. However, these algorithms require massive storage, complex bit manipulation, and extensive code modification when scaled to higher dimensions. In this article, we develop a two‐bit geohash coding algorithm that divides the search space into four equal partitions, each assigned a two‐bit label (00, 01, 10, or 11). Along any given dimension, these labels uniquely identify a chosen data point and its two neighbors, one on either side. This feature simplifies the generation of geohash codes for the neighboring grid cells. In addition, the algorithm uses memory efficiently by storing the geohash values of the training points as integers. In experiments on climate data assimilation, model‐to‐observation space mapping with a geohash code length of 24 bits over the latitude‐longitude extent of India achieved an accuracy of 85%. Performance and scalability evaluation of the proposed algorithm, optimized for multicore and many‐core processors, showed significant speedups over a tree‐based approach. The algorithm provides a foundation for new spatial statistical methods for pattern discovery and detection in spatial big data.
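The four‐way partitioning idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names (`encode_axis`, `axis_neighbors`, `geohash`), the choice to split the bits evenly between the two dimensions, and the choice to concatenate rather than interleave the per‐dimension codes are all assumptions made for illustration. The abstract states only that each partition receives a two‐bit label and that codes are stored as integers.

```python
def encode_axis(value, lo, hi, levels):
    """Encode one coordinate as an integer of 2 * levels bits.

    At each level the current interval [lo, hi] is cut into four equal
    parts labelled 0..3 (binary 00, 01, 10, 11), and the label of the
    part containing `value` is appended to the code.
    """
    if not lo <= value <= hi:
        raise ValueError("value lies outside the extent")
    code = 0
    for _ in range(levels):
        width = (hi - lo) / 4.0
        # Clamp to 3 so that value == hi falls in the last partition.
        label = min(int((value - lo) / width), 3)
        code = (code << 2) | label
        lo = lo + label * width
        hi = lo + width
    return code


def axis_neighbors(code, levels):
    """Return the codes of the adjacent cells along the same dimension.

    Assumption: because labels are assigned in increasing order, the
    neighboring cells are simply code - 1 and code + 1, where they exist.
    """
    last = 4 ** levels - 1
    return [c for c in (code - 1, code + 1) if 0 <= c <= last]


def geohash(lat, lon, lat_extent, lon_extent, levels):
    """Combine both per-axis codes into one integer (hypothetical layout:
    latitude bits in the high half, longitude bits in the low half)."""
    lat_code = encode_axis(lat, lat_extent[0], lat_extent[1], levels)
    lon_code = encode_axis(lon, lon_extent[0], lon_extent[1], levels)
    return (lat_code << (2 * levels)) | lon_code
```

Under these assumptions, a 24‐bit code corresponds to `levels=6` (12 bits per dimension), and the whole code for a point fits in a single integer, which is what makes integer storage of the training points possible.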
