An error correction model is established by combining experimental values with values calculated by the semi-empirical AM1 method. Multiple stepwise regression (MSR) on the training set identifies five descriptors that strongly affect the error term, and the linear equation of the correction model is derived from them. The test set is used to verify the reliability of the correction model. On the training set, the corrected model increases the coefficient of determination (R²) from 0.688 to 0.966 and reduces the root mean square error (RMSE) from 98.563 to 27.249. On the test set, R² increases from 0.872 to 0.884 and the RMSE decreases from 43.337 to 37.736. The model therefore shows good fitting ability and stability. According to the MSR results, the atomic structure that contributes most to the error is 〉S==, which indicates that the electronic environment of highly electronegative atoms should be taken into account; this can guide the accurate calculation of thermodynamic properties.
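The workflow described above (regress the error between experimental and AM1 values on a few selected descriptors, then add the predicted error back to the calculated value) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic, the descriptor matrix `X` and the helper names `forward_stepwise`, `fit_ols`, and `predict_ols` are hypothetical, and the selection step is a simplified greedy forward selection on training RMSE rather than the F-test based entry and removal procedure of a full MSR. It does not reproduce the descriptors or the numbers reported here.

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return float(1.0 - ss_res / ss_tot)

def fit_ols(X, y):
    """Ordinary least squares with an intercept; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_ols(coef, X):
    return coef[0] + X @ coef[1:]

def forward_stepwise(X, y, max_terms=5):
    """Greedy forward selection: at each step add the descriptor that
    lowers the training RMSE the most (simplified stand-in for MSR)."""
    selected, remaining = [], list(range(X.shape[1]))
    while remaining and len(selected) < max_terms:
        scores = []
        for j in remaining:
            cols = selected + [j]
            coef = fit_ols(X[:, cols], y)
            scores.append((rmse(y, predict_ols(coef, X[:, cols])), j))
        _, best_j = min(scores)
        selected.append(best_j)
        remaining.remove(best_j)
    return selected

# Synthetic stand-in data (assumption): 12 candidate descriptors, of which
# columns 0, 3 and 7 drive the error of the "calculated" values.
rng = np.random.default_rng(0)
n, p = 200, 12
X = rng.normal(size=(n, p))
y_exp = rng.normal(scale=100.0, size=n)            # "experimental" values
true_err = 40 * X[:, 0] - 30 * X[:, 3] + 20 * X[:, 7] + rng.normal(scale=5.0, size=n)
y_calc = y_exp - true_err                          # "calculated" values

train, test = np.arange(150), np.arange(150, 200)

# Fit the correction on the training-set error term only.
err_train = y_exp[train] - y_calc[train]
cols = forward_stepwise(X[train], err_train, max_terms=5)
coef = fit_ols(X[train][:, cols], err_train)

# Corrected value = calculated value + predicted error.
y_corr = y_calc + predict_ols(coef, X[:, cols])

for name, idx in (("train", train), ("test", test)):
    print(name,
          "R2 raw/corrected:", r2(y_exp[idx], y_calc[idx]), r2(y_exp[idx], y_corr[idx]),
          "RMSE raw/corrected:", rmse(y_exp[idx], y_calc[idx]), rmse(y_exp[idx], y_corr[idx]))
```

Evaluating R² and RMSE separately on the held-out indices mirrors the training and test comparison reported above; the correction is fitted only on the training error term, so the test metrics indicate how well it generalizes.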