Machine learning combined with solid solution strengthening model for predicting hardness of high entropy alloys

Yi-Fan Zhang, Wei-Li Wang, Liang Chang, Wei Ren, Nan Li, Qian Zhou, Shu-Jian Ding

doi:10.7498/aps.72.20230646

Abstract

Traditional computational methods in materials science, such as first-principles calculations and thermodynamic simulations, have accelerated the discovery of new materials. However, these methods cannot easily be adapted to build models for different target properties, they consume substantial computational resources, and their prediction accuracy is limited. Over the last decade, data-driven machine learning techniques have gradually been applied to materials science, a field that has accumulated a large quantity of theoretical and experimental data. Machine learning can extract hidden information from these data and help to predict material properties. The data in this work are collected from published literature, and several performance-oriented algorithms are selected to build a prediction model for the hardness of high entropy alloys. A high entropy alloy hardness dataset containing 19 candidate features is used to train, test, and evaluate models based on an ensemble learning algorithm. First, a genetic algorithm filters the 19 candidate features down to an optimized set of 8 features. Then, a two-stage feature selection approach is combined with traditional solid solution strengthening theory to further reduce the features, and the three most representative feature parameters are chosen to build a random forest model for hardness prediction. The model achieves an R² of 0.9416 under 10-fold cross-validation. To better understand the prediction mechanism, solid solution strengthening theory is used to explain the hardness differences among alloys. Furthermore, when a genetic algorithm is used for feature selection, the atomic size, electronegativity, and modulus mismatch features are found to have very important effects on the solid solution strengthening of high entropy alloys.
The machine learning algorithm and features are then applied to predicting solid solution strengthening properties, giving an R² of 0.8811 under 10-fold cross-validation. The selected parameters show good transferability across various high entropy alloy systems. Because the random forest algorithm is poorly interpretable, the SHAP (SHapley Additive exPlanations) interpretable machine learning method is used to reveal the internal reasoning of the established model and to clarify how each feature influences hardness. In particular, the valence electron concentration is found to have the most significant weakening effect on the hardness of high entropy alloys.
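
The mismatch and valence electron concentration descriptors named in the abstract are composition-weighted quantities. As a minimal sketch, assuming one commonly used relative-mismatch form, sqrt(sum_i c_i (1 - p_i / p_mean)^2), which may differ from the exact definitions adopted in the paper, they could be computed as follows. The function names and all numerical inputs are hypothetical placeholders, not values from the dataset.

```python
from math import sqrt

def weighted_mean(fractions, values):
    """Composition-weighted average of an elemental property."""
    return sum(c * v for c, v in zip(fractions, values))

def relative_mismatch(fractions, values):
    """Relative mismatch sqrt(sum_i c_i * (1 - p_i / p_mean)^2).

    Assumed form: widely used for atomic size mismatch; the paper's
    electronegativity and modulus mismatch definitions may differ.
    """
    mean = weighted_mean(fractions, values)
    return sqrt(sum(c * (1.0 - v / mean) ** 2 for c, v in zip(fractions, values)))

# Hypothetical equiatomic binary alloy; the numbers are placeholders only.
fractions = [0.5, 0.5]   # atomic fractions, must sum to 1
radii = [1.0, 1.2]       # placeholder atomic radii (angstrom)
vec = [6, 10]            # placeholder valence electron counts

size_mismatch = relative_mismatch(fractions, radii)
mean_vec = weighted_mean(fractions, vec)
print(f"size mismatch = {size_mismatch:.4f}, VEC = {mean_vec:.1f}")
```

Descriptors of this kind would form the columns of the feature table passed to the feature selection and random forest steps; a single-element composition gives zero mismatch by construction.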

Full Text