We introduce automatic clustering as a computationally efficient tool for classifying and interpreting trajectories from simulations of photo-excited dynamics. Trajectories are treated as time-series data, with the features for clustering selected by variance mapping of normalized data. The L2-norm and dynamic time warping are proposed as suitable similarity measures for calculating the distance matrices, and these are clustered using the unsupervised density-based DBSCAN algorithm. The silhouette coefficient and the number of trajectories classified as noise are used as quality measures for the clustering. The ability of clustering to provide a rapid overview of large and complex trajectory data sets, and its utility for extracting chemical and physical insight, are demonstrated on trajectories corresponding to the photochemical ring-opening reaction of 1,3-cyclohexadiene. The clustering can also be used to generate reduced-dimensionality representations in an unbiased manner.
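The pipeline described above (pairwise distances between trajectories, followed by density-based clustering) can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the toy trajectories, the function names, and the values of `eps` and `min_samples` are all assumptions chosen for illustration, each trajectory is reduced to a single hypothetical feature sampled at five time steps, and the normalization, variance-mapping, and silhouette steps are omitted.

```python
import math

def l2_distance(a, b):
    """Euclidean (L2) distance between two equal-length 1-D time series."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def dtw_distance(a, b):
    """Textbook dynamic time warping with |x - y| as the local cost."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def distance_matrix(series, dist):
    """Symmetric matrix of pairwise distances under the chosen measure."""
    n = len(series)
    D = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            D[i][j] = D[j][i] = dist(series[i], series[j])
    return D

def dbscan(D, eps, min_samples):
    """Minimal DBSCAN on a precomputed distance matrix; label -1 marks noise."""
    n = len(D)
    labels = [None] * n

    def neighbours(i):
        # A point counts as its own neighbour, since D[i][i] == 0.
        return [j for j in range(n) if D[i][j] <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_samples:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_samples:  # only core points expand the cluster
                queue.extend(nb)
    return labels

# Hypothetical toy data: two families of trajectories plus one outlier.
trajs = [
    [0.0, 0.1, 0.2, 0.3, 0.4],
    [0.0, 0.1, 0.2, 0.3, 0.5],
    [0.0, 0.1, 0.2, 0.4, 0.4],
    [0.0, 0.5, 1.0, 1.5, 2.0],
    [0.0, 0.5, 1.0, 1.5, 2.1],
    [0.0, 0.6, 1.0, 1.5, 2.0],
    [3.0, 3.0, 3.0, 3.0, 3.0],
]
labels = dbscan(distance_matrix(trajs, l2_distance), eps=0.3, min_samples=2)
n_noise = labels.count(-1)
print(labels, n_noise)  # [0, 0, 0, 1, 1, 1, -1] 1
```

Passing `dtw_distance` instead of `l2_distance` to `distance_matrix` switches the similarity measure without touching the clustering step, which is the main reason for keeping the distance matrix precomputed. The noise count `n_noise` corresponds to one of the two quality measures mentioned above.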