A formulation and an algorithm are presented for constructing a truncated polynomial chaos expansion (PCE) of a vector-valued random output. This representation depends on a vector-valued random input with a known probability measure and on a vector-valued random latent variable with an unknown probability measure. The construction of the PCE relies solely on a training set comprising a small number of independent realizations of the non-Gaussian, dependent random output and input vectors. The training set consists of heterogeneous data, which makes the chaos coefficients difficult to estimate accurately. Despite this heterogeneity, the proposed formulation and algorithm allow a highly accurate global surrogate model to be constructed. We also propose an alternative approach in which the surrogate model is built after first separating the heterogeneous dataset into subsets, each containing “quasi-homogeneous” data. The separation method is designed to account for a partial overlap of the supports of the probability measures associated with the subsets. The PCE is identified offline. The resulting PCE provides a fast online surrogate model, enabling the analysis of large dynamical systems that is beyond currently available computational capabilities. An application to atomic collisions of helium on a graphite substrate is presented, for which the training set was generated by molecular dynamics simulations reported in a previous paper. The results obtained demonstrate the accuracy of the proposed approach.