Abstract

A spatial co-location pattern is a subset of spatial features whose instances are frequently located together in geographic space. Mining co-location patterns is particularly valuable for discovering spatial dependencies. Traditional co-location pattern mining algorithms become computationally expensive as data volumes grow rapidly. In this paper, we explore a novel iterative framework for co-location pattern mining based on parallel ordered-clique growth. The ordered clique extension can reuse previously processed information and be executed in parallel, which speeds up the identification of co-location instances. Based on this iterative framework, we design a MapReduce algorithm, PCPM_OC, that searches for prevalent co-location patterns in a level-wise manner. To narrow the search space of ordered cliques, we propose two pruning techniques that filter out as many invalid clique instances as possible. We prove the completeness and correctness of PCPM_OC and discuss its complexity. Moreover, we compare PCPM_OC with two advanced MapReduce-based co-location pattern mining algorithms from multiple perspectives. Finally, extensive experiments are conducted on synthetic and real-world spatial datasets to study the performance of PCPM_OC. The experimental results demonstrate that PCPM_OC significantly improves efficiency and scales better on massive spatial data.
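
To make the idea of ordered clique growth concrete, the following is a minimal single-machine sketch and not the PCPM_OC algorithm itself: it omits MapReduce, the two pruning techniques, and the prevalence test. It assumes instances are given a total order by their identifiers, that two instances are neighbors when they belong to different features and lie within a distance threshold, and that a size-(k+1) clique is obtained by appending to a size-k clique one later instance that neighbors every member. The names `neighbor_map`, `grow_cliques`, `points`, and `radius` are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import dist


def neighbor_map(points, radius):
    """Map each instance id to the set of LATER ids that are its neighbors.

    points: dict of id -> (feature, x, y). Two instances are treated as
    neighbors when their features differ and their distance is <= radius
    (an assumed neighbor relation).
    """
    ids = sorted(points)
    larger = {i: set() for i in ids}
    for a, b in combinations(ids, 2):  # a precedes b in the sorted order
        fa, xa, ya = points[a]
        fb, xb, yb = points[b]
        if fa != fb and dist((xa, ya), (xb, yb)) <= radius:
            larger[a].add(b)
    return larger


def grow_cliques(cliques, larger):
    """Extend every ordered k-clique by one instance to form (k+1)-cliques.

    A candidate must be a later neighbor of every member, so each clique is
    generated exactly once, from its own ordered prefix. The k-cliques from
    the previous round are reused rather than recomputed.
    """
    grown = []
    for clique in cliques:
        candidates = set.intersection(*(larger[m] for m in clique))
        for c in sorted(candidates):
            grown.append(clique + (c,))
    return grown


# Toy data: three features (A, B, C) with a few instances each.
points = {
    "A1": ("A", 0.0, 0.0),
    "B1": ("B", 1.0, 0.0),
    "C1": ("C", 0.0, 1.0),
    "A2": ("A", 5.0, 5.0),
    "B2": ("B", 5.5, 5.0),
}
larger = neighbor_map(points, radius=1.5)
singles = [(i,) for i in sorted(points)]
pairs = grow_cliques(singles, larger)    # size-2 clique instances
triples = grow_cliques(pairs, larger)    # size-3 clique instances
```

Because each clique in a round is extended independently of the others, the loop body in `grow_cliques` could in principle be distributed across workers, which is the property the abstract attributes to the ordered clique extension.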

