Smart grids collect large volumes of smart meter data in the form of time series, so-called load patterns. We outline the applications that benefit from analyzing this data, ranging from customer segmentation to operational system planning, and propose a two-stage load pattern clustering approach. The first stage is performed per individual user and identifies the typical daily power usage patterns that user exhibits. The second stage takes those typical user patterns as input and groups users that are similar. To improve scalability, we apply the fast wavelet transform (FWT) to the time series data, which reduces the dimensionality of the feature space in which the clustering algorithm operates (i.e., from $N$ data points in the time domain to $\log N$ features). A further, qualitative benefit of FWT is that patterns that are identical in shape and differ only by a (typically small) time shift still end up in the same cluster. Furthermore, we use $g$-means instead of $k$-means as the clustering algorithm, so that the number of clusters does not have to be fixed in advance. Our comprehensive set of experiments analyzes the impact of using FWT versus time-domain features, and of $g$-means versus $k$-means, and shows that our system is comparable to state-of-the-art methods in terms of cluster quality metrics, while being more scalable because of the dimensionality reduction.
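The reduction from $N$ time-domain points to $\log N$ features can be illustrated with a minimal sketch. It assumes a Haar wavelet and summarizes each decomposition level by the energy of its detail coefficients; both choices, as well as the function name `haar_level_energies` and the sample profile, are illustrative assumptions and not necessarily the feature definition used in the paper.

```python
import math


def haar_level_energies(series):
    """Return one energy value per Haar decomposition level.

    Assumption for illustration: each level is summarized by the sum of
    squared detail coefficients, giving log2(N) features for N samples.
    """
    n = len(series)
    if n < 2 or n & (n - 1):
        raise ValueError("length must be a power of two and at least 2")

    approx = [float(x) for x in series]
    energies = []
    while len(approx) > 1:
        # Pair up neighbouring samples: (x0, x1), (x2, x3), ...
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Orthonormal Haar step: differences are details, sums are the
        # coarser approximation passed on to the next level.
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        energies.append(sum(d * d for d in detail))
    return energies


# Hypothetical daily load profile: 32 readings with one evening peak.
profile = [0.2] * 32
profile[20:24] = [1.0, 1.5, 1.5, 1.0]

features = haar_level_energies(profile)
print(len(profile), "time-domain points ->", len(features), "features")
```

Because each level is reduced to a single number, the position of a peak within the day is largely discarded, which is in the spirit of the shift tolerance described above; how tolerant the features actually are depends on the wavelet and on the per-level summary chosen. The resulting short feature vectors would then be the input to the clustering algorithm in both stages.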