Abstract

Feature selection targets for selecting relevant and useful features, and is a vital challenge in turbulence modeling by machine learning methods. In this paper, a new posterior feature selection method based on validation dataset is proposed, which is an efficient and universal method for complex systems including turbulence. Different from the priori feature importance ranking of the filter method and the exhaustive search for feature subset of the wrapper method, the proposed method ranks the features according to the model performance on the validation dataset, and generates the feature subsets in the order of feature importance. Using the features from the proposed method, a black-box model is built by artificial neural network (ANN) to reproduce the behavior of Spalart-Allmaras (S-A) turbulence model for high Reynolds number (Re) airfoil flows in aeronautical engineering. The results show that compared with the model without feature selection, the generalization ability of the model after feature selection is significantly improved. To some extent, it is also demonstrated that although the feature importance can be reflected by the model parameters during the training process, artificial feature selection is still very necessary.

Highlights

  • In recent years, there has been a surge in research on the combination of machine learning and turbulence

  • The method selects the features by three steps: firstly, measure the numerical distribution characteristics of the training and validation datasets according to the mean extrapolation distance and eliminate the features with obvious numerical extrapolation; secondly, construct a model with all

  • Using the features from the proposed method, a turbulence surrogate model is constructed with artificial neural networks driven by the flow field data calculated by N-S solver coupled with S-A model

Read more

Summary

Introduction

There has been a surge in research on the combination of machine learning and turbulence These researches can be divided into two categories according to the model form: classifiers for identifying uncertainty regions and regression models for calculating turbulence-related variables. Researchers can use machine learning to recognize the uncertain regions calculated by the Reynolds-Averaged Navier-Stokes (RANS) turbulence model based on experimental or high-fidelity data [1, 2]. The uncertainty of turbulence calculation comes from the ensemble averaging operation of Navier-Stokes (N-S) equation, the functional form of a model, the representation of Reynolds stress and the empirical parameters of the model [3]. Machine learning is mainly used to improve or substitute the conventional RANS model and the subgrid model in large eddy simulation (LES). With the research deeper and deeper, there are still many problems to be solved, such as feature selection (FS) and insufficient generalization ability, etc. [14, 15]

Methods
Results
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call