MCHT: A maximal clique and hash table-based maximal prevalent co-location pattern mining algorithm

Vanha Tran, Lizhen Wang, Hongmei Chen, Qing Xiao

doi:10.1016/j.eswa.2021.114830

Abstract

Co-location patterns are subsets of Boolean spatial features whose instances frequently appear together in nearby geographic space. Maximal co-location patterns are a compact representation of these patterns that makes it easier for users to absorb the results and draw meaningful inferences. Current algorithms for maximal co-location pattern mining are based on a generate-and-test candidate model. Most of the execution time of this model is spent collecting the co-location instances of candidates, so discovering maximal co-location patterns remains very challenging when the data is large and/or dense. To address this challenge, this study develops a novel maximal co-location pattern mining framework based on maximal cliques and hash tables (MCHT). First, all maximal cliques, which compactly represent the neighbor relationships between instances of a spatial data set, are enumerated; bit string operations are used to speed up this enumeration. Next, a participating instance hash table is constructed from these maximal cliques. Information about the co-location instances of maximal patterns can then be queried and collected efficiently from the hash table. After that, the participation indexes of these patterns are calculated to measure their prevalence, and the maximal prevalent co-location patterns are selected efficiently. Finally, a series of experiments on both synthetic and real facility data sets demonstrates that the proposed algorithm reduces both computational time and memory consumption compared with existing algorithms.
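
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy data, the function names, the use of Bron-Kerbosch with pivoting for clique enumeration, and the layout of the hash table (keyed by the set of feature types in each maximal clique) are all assumptions made for illustration. The participation index is taken in its usual form, the minimum over the features of a pattern of the fraction of that feature's instances that take part in the pattern.

```python
from collections import Counter, defaultdict

# Hypothetical toy data: (feature, id) instances and neighbor pairs by index.
instances = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("C", 1)]
edges = [(0, 3), (0, 5), (3, 5), (1, 4)]  # A1-B1, A1-C1, B1-C1, A2-B2

def build_adjacency(n, edges):
    """Neighbor sets stored as Python ints used as bit strings."""
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    return adj

def bits(mask):
    """Yield the indices of the set bits in mask."""
    while mask:
        low = mask & -mask
        yield low.bit_length() - 1
        mask ^= low

def maximal_cliques(adj):
    """Bron-Kerbosch with pivoting (an assumed choice), on bit strings."""
    cliques = []

    def expand(r, p, x):
        if p == 0 and x == 0:
            cliques.append(r)  # r cannot be extended: it is maximal
            return
        # Pivot: vertex of P | X with the most neighbors in P.
        pivot = max(bits(p | x), key=lambda v: bin(p & adj[v]).count("1"))
        cand = p & ~adj[pivot]
        while cand:
            v = (cand & -cand).bit_length() - 1
            bit = 1 << v
            expand(r | bit, p & adj[v], x & adj[v])
            p &= ~bit
            x |= bit
            cand &= ~bit

    expand(0, (1 << len(adj)) - 1, 0)
    return cliques

def build_hash_table(instances, cliques):
    """Map feature set of each clique -> feature -> participating instances."""
    table = defaultdict(lambda: defaultdict(set))
    for clique in cliques:
        members = [instances[i] for i in bits(clique)]
        key = frozenset(f for f, _ in members)
        for inst in members:
            table[key][inst[0]].add(inst)
    return table

def participation_index(pattern, table, feature_counts):
    """Minimum participation ratio over the features of the pattern."""
    pattern = frozenset(pattern)
    participants = defaultdict(set)
    for key, by_feature in table.items():
        if pattern <= key:  # cliques covering every feature of the pattern
            for f in pattern:
                participants[f] |= by_feature[f]
    return min(len(participants[f]) / feature_counts[f] for f in pattern)

adj = build_adjacency(len(instances), edges)
cliques = maximal_cliques(adj)
table = build_hash_table(instances, cliques)
counts = Counter(f for f, _ in instances)
print(participation_index({"A", "B"}, table, counts))
```

In this sketch a pattern would be reported as prevalent when its participation index reaches a user-chosen threshold; the query only touches hash table entries whose key contains the pattern, which is the kind of lookup the abstract describes as replacing repeated collection of co-location instances.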

Full Text