Abstract
This paper explores automated approaches for the analysis and categorization of turbulent flow data as a means of assessing the quality of a turbulence dataset used for constructing data-driven turbulence closures. Single-point statistics from several high-fidelity turbulent flow simulation datasets are differentiated into groups using a Gaussian mixture model clustering algorithm. Candidate features are proposed, and a feature selection algorithm is applied to the data in a sequential fashion, flow by flow, to identify a good feature set and an optimal number of clusters for each dataset. Clusters are first identified for plane channel flows, producing results that agree with existing theory and empirical observations. Further clusters are then identified in an incremental fashion for flow over a wavy-walled channel, flow over a bump in a channel, and flow past a square cylinder. Some clusters are closely identified with the anisotropy state of the turbulence, whereas others can be connected to physical phenomena, such as boundary-layer separation and free shear layers. Exemplar points from the clusters, or prototypes, are then identified using a prototype placement method. These exemplars effectively summarize the dataset using a greatly reduced collection of data points. The clusters and their prototypes are used to assess the quality of a training dataset constructed by simply pooling the four flows. We enumerate the dataset’s shortcomings and state the limits of generalizability of any data-driven closure trained on it.
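As a rough illustration of the clustering-then-prototype workflow described above, the following minimal sketch fits Gaussian mixture models to a feature array, chooses the number of clusters, and picks one exemplar point per cluster. Several parts are assumptions made for illustration only and are not taken from the paper: the Bayesian information criterion (BIC) stands in for whatever criterion the authors use to choose the number of clusters, the nearest-to-mean rule stands in for their prototype placement method, the feature selection step is omitted, and the synthetic two-dimensional data and all function names are hypothetical.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def fit_gmm_by_bic(features, max_clusters=6, seed=0):
    """Fit GMMs with 1..max_clusters components; keep the lowest-BIC model.

    Assumption: BIC is used here only as a common model-selection
    criterion; the paper's own selection procedure may differ.
    """
    best_model, best_bic = None, np.inf
    for k in range(1, max_clusters + 1):
        model = GaussianMixture(
            n_components=k,
            covariance_type="full",
            n_init=3,
            random_state=seed,
        ).fit(features)
        bic = model.bic(features)
        if bic < best_bic:
            best_model, best_bic = model, bic
    return best_model


def nearest_to_mean_prototypes(features, model):
    """Return {cluster label: row index} of the member closest to each mean.

    Assumption: a simple stand-in for a prototype placement method,
    not the method used in the paper.
    """
    labels = model.predict(features)
    prototypes = {}
    for k in range(model.n_components):
        members = np.flatnonzero(labels == k)
        if members.size == 0:
            continue  # no point was assigned to this component
        dist = np.linalg.norm(features[members] - model.means_[k], axis=1)
        prototypes[k] = int(members[np.argmin(dist)])
    return prototypes


def make_synthetic_features(n_per_group=200, seed=0):
    """Hypothetical stand-in for single-point turbulence statistics."""
    rng = np.random.default_rng(seed)
    centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
    return np.vstack(
        [c + 0.5 * rng.standard_normal((n_per_group, 2)) for c in centers]
    )


if __name__ == "__main__":
    X = make_synthetic_features()
    gmm = fit_gmm_by_bic(X)
    protos = nearest_to_mean_prototypes(X, gmm)
    print("clusters selected:", gmm.n_components)
    print("prototype row indices:", protos)
```

In this sketch each prototype is an actual data point rather than a cluster mean, which mirrors the idea of summarizing a dataset with a small set of exemplars drawn from it.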