Data on the movements of diverse objects, including people, animals, and commodities, are being collected in growing volumes as location-aware technologies become pervasive. Clustering has become an increasingly important analytical tool for revealing travel patterns in large-scale movement datasets. Most existing methods for origin-destination (OD) flow clustering focus on the geographic properties of an OD flow but ignore its temporal information, which reflects how travel patterns change over time. In addition, most methods require predetermined parameters as inputs and are difficult to adjust as users’ demands change. To overcome these limitations, we present a novel OD flow clustering method, TOCOFC (Tree-based and Optimum Cut-based Origin-Destination Flow Clustering). A similarity measure is proposed to quantify the spatial similarity between OD flows, and it can be extended to measure their spatiotemporal similarity. By constructing a maximum spanning tree and splitting it into several unrelated parts, we effectively remove the noise in the flow data. Furthermore, a recursive two-way optimum cut-based method is used to partition the graph composed of OD flows into OD flow clusters, and a criterion called CSSC (Child tree/Child graph Self-Similarity Criterion) is formulated to determine whether the clusters meet the output requirements. By modifying the parameters, TOCOFC can produce clustering results at different temporal and spatial scales, which makes it possible to study movement patterns from a multiscale perspective. However, TOCOFC suffers from low efficiency and high memory consumption, which makes it poorly suited to processing large-scale data quickly.
Compared with previous methods, TOCOFC achieves better clustering performance: it can guarantee a balance between clusters, which helps to fully understand the corresponding patterns. TOCOFC can also perform spatiotemporal clustering of OD flows, which helps to capture how patterns differ across time periods for deeper analysis. Extensive experiments on both artificial spatial datasets and real-world spatiotemporal datasets demonstrate the effectiveness and flexibility of TOCOFC.
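The tree-based noise-removal step described above can be illustrated with a minimal sketch. It is not the TOCOFC implementation, and several parts are assumptions made only for illustration: the `similarity` function (an inverse of the summed origin and destination distances), the `min_similarity` threshold used to split the tree, and the rule that single-flow parts count as noise. The abstract does not specify any of these. The recursive optimum cut and the CSSC criterion are not shown.

```python
import math
from itertools import combinations

def similarity(f, g):
    """Hypothetical spatial similarity between flows (ox, oy, dx, dy).

    Assumption for illustration: similarity decays with the sum of the
    origin-to-origin and destination-to-destination distances.
    """
    d_o = math.dist(f[:2], g[:2])
    d_d = math.dist(f[2:], g[2:])
    return 1.0 / (1.0 + d_o + d_d)

def maximum_spanning_tree(flows):
    """Kruskal's algorithm on the complete similarity graph, heaviest edges first."""
    parent = list(range(len(flows)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        ((similarity(flows[i], flows[j]), i, j)
         for i, j in combinations(range(len(flows)), 2)),
        reverse=True,
    )
    tree = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # the edge joins two different components, so keep it
            parent[ri] = rj
            tree.append((w, i, j))
    return tree

def split_tree(n, tree, min_similarity):
    """Drop tree edges below min_similarity (assumed rule); return the parts."""
    adj = {i: [] for i in range(n)}
    for w, i, j in tree:
        if w >= min_similarity:
            adj[i].append(j)
            adj[j].append(i)
    seen, parts = set(), []
    for start in range(n):
        if start in seen:
            continue
        stack, comp = [start], []
        seen.add(start)
        while stack:  # depth-first traversal of one component
            u = stack.pop()
            comp.append(u)
            for v in adj[u]:
                if v not in seen:
                    seen.add(v)
                    stack.append(v)
        parts.append(sorted(comp))
    return parts

# Toy data: two bundles of similar flows and one isolated flow.
flows = [
    (0.0, 0.0, 10.0, 10.0),
    (0.1, 0.0, 10.0, 10.1),
    (0.0, 0.2, 10.1, 10.0),
    (5.0, 5.0, 0.0, 0.0),
    (5.1, 5.0, 0.1, 0.0),
    (50.0, 50.0, 90.0, 20.0),
]
tree = maximum_spanning_tree(flows)
parts = split_tree(len(flows), tree, min_similarity=0.5)
clusters = [p for p in parts if len(p) >= 2]
noise = [p[0] for p in parts if len(p) == 1]  # assumed: singleton parts are noise
```

On this toy input, the two bundles of nearby flows remain connected and the isolated flow ends up in a part of its own. Building the complete similarity graph takes quadratic time and memory in the number of flows, which is consistent with the efficiency limitation noted above.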