We present theoretical and experimental results for spatiotemporal graph k-means (STGkM)—a new unsupervised method to cluster vertices within a dynamic network. STGkM finds both short-term dynamic clusters and a “long-lived” partition of vertices within a network whose topology is evolving over time; we first introduced this technique in a recent conference paper. Here, we update our algorithm with a more efficient relaxation scheme, provide additional theoretical results, compare its performance to several other methods, and demonstrate its capabilities on real, diverse datasets. We construct a theoretical foundation to distinguish STGkM from connected components and static clustering and prove results for the stochastic setting for the first time. In addition to our previous experiments on the United States House of Representatives dataset, we report new state-of-the-art empirical results on a dynamic scientific citation network and Reddit dataset. These findings demonstrate that STGkM is accurate, efficient, informative, and operates well in diverse settings. Finally, as previously noted, one of the main advantages of STGkM is that it has only one required parameter: k, the number of clusters; we therefore include an extended analysis of the range of this parameter and guidance on selecting its optimal value. Our data and code are available on Github; see: https://github.com/dynestic/stgkm.
Read full abstract