Mining co-location patterns hidden in spatial data is crucial for spatial association discovery and has broad prospects in many applications. High Utility Co-location Pattern Mining (HUCPM) further takes the utility of spatial features into consideration, which makes it more realistic than traditional co-location pattern mining. However, HUCPM is more difficult, since the Apriori-like pruning technique does not apply. To address this problem, we first propose two novel pruning strategies to trim the pattern search space. Then, a series of optimization techniques is presented to speed up the calculation of the pattern utility ratio of each candidate. Based on these techniques, a fast HUCPM algorithm is proposed, which searches for the high utility co-locations in each pattern branch in a depth-extending manner and is equipped with a heuristic strategy to enhance the effect of the pruning techniques. Moreover, we theoretically prove the completeness and correctness of the proposed algorithm and discuss its complexity. We conduct extensive experiments on multiple spatial datasets to demonstrate the superiority of our algorithm in efficiency and scalability, as well as the effectiveness of the proposed techniques. In particular, the proposed algorithm runs from several times to several orders of magnitude faster than the baselines.