Application domains such as environmental health science, climate science, and geosciences—where the relationship between humans and the environment is studied—are constantly evolving and require innovative approaches in geospatial data analysis. Recent technological advancements have led to the proliferation of high-granularity geospatial data, enabling such domains but posing major challenges in managing vast datasets that have high spatiotemporal similarities. We introduce the Hierarchical Grid Partitioning (HierGP) framework to address this issue. Unlike conventional discrete global grid systems, HierGP dynamically adapts to the data’s inherent characteristics. At the core of our framework is the Map Point Reduction (MPR) algorithm, designed to aggregate and then collapse data points based on user-defined similarity criteria. This effectively reduces data volume while preserving essential information. The reduction process is particularly effective in handling environmental data from extensive geographical regions. We structure the data into a multilevel hierarchy from which a reduced representative dataset can be extracted. We compare the performance of HierGP against several state-of-the-art geospatial indexing algorithms and demonstrate that HierGP outperforms the existing approaches in terms of runtime, memory footprint, and scalability. We illustrate the benefits of the HierGP approach using two representative applications: analysis of over 289 million location samples from a registry of participants and efficient extraction of environmental data from large polygons. While the application demonstration in this work has focused on environmental health, the methodology of the HierGP framework can be extended to explore diverse geospatial analytics domains.
Read full abstract