The paper presents the results of modeling the proof strength of pipe steels that have undergone tempering heat treatment. The main types of models used in the study are described, and the advantages and drawbacks of different approaches to modeling the target variable are summarized. Empirical equations relating hardness to yield strength and tensile strength are given, and the role of the parameter n in these equations is indicated. The choice of the independent variables used in the models is explained. The distribution of the target variable in the data sample is shown, the feature space used for each model is described, and a general description of the source data is given. The structure of the main data sample is studied using the DBSCAN clustering method and the t-SNE dimensionality reduction algorithm. Splitting the sample into clusters is justified as a way of reducing the spread of the predicted proof strength, and the effectiveness of the split is estimated from the spread of n. Various regression models for predicting yield strength are compared. The regression model based on gradient boosting over decision trees (LightGBM) is shown to have the smallest prediction error among the models considered. The permutation importance of the features is determined for this model, and the calculated importance is compared with that expected from metallurgical theory. The validity of the obtained prediction models is evaluated in terms of both the feature importance and the error metric used in the study. The hypothesis that a proxy variable (n) obtained from theoretical calculations can serve as a predictor in the yield strength prediction model is tested.
It is demonstrated that applying the grouping method together with the parameter n makes it possible to obtain satisfactory prediction results with a smaller feature space.