Pyrimidines have shown promise as nontoxic corrosion inhibitors for carbon steel in acid media and could replace the toxic chemicals currently in use. However, the discovery of these inhibitors has mainly relied on expensive trial-and-error experimental approaches. Recent studies indicate that machine learning can speed up the discovery of novel corrosion inhibitor molecules at minimal cost. In the present work, partial least squares (PLS) regression and random forest (RF) were used to develop predictive models for fifty-four (54) pyrimidine derivatives whose experimentally determined inhibition efficiencies as corrosion inhibitors for carbon steel in hydrochloric acid medium are available in the literature. Seven descriptors were selected by PLS and used to develop the linear model. The PLS variable importance results indicate that molecular mass, molecular volume, electrophilicity, electronegativity, energy of the lowest unoccupied molecular orbital, electron affinity, and the logarithm of the partition coefficient were the main factors determining the inhibition efficiency. RF was used to capture the nonlinear nature of the data and to predict the inhibition efficiency accurately. Rigorous internal and external validation was performed for both the PLS and RF models to further verify their robustness and predictive ability. RF yielded the best results, with a mean squared error (MSE) of 32.602 compared with 64.641 for PLS. Both models were subsequently used to predict five (5) new pyrimidines with very high inhibition efficiency. The results of this work can provide reference information and theoretical guidance for designing and synthesizing new and effective pyrimidine corrosion inhibitors.
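
For readers comparing the two reported error values, the following sketch may help. It assumes that MSE here denotes the usual mean squared error over the n compounds in the evaluation set; the symbols y_i (experimental inhibition efficiency) and \hat{y}_i (model prediction) are introduced only for illustration and do not appear in the abstract.

```latex
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2,
\qquad
\mathrm{RMSE} = \sqrt{\mathrm{MSE}}

% Under this assumption, the reported values correspond to
\sqrt{32.602} \approx 5.71 \quad (\text{RF}),
\qquad
\sqrt{64.641} \approx 8.04 \quad (\text{PLS})
```

Taking the square root puts the error back in the same units as the inhibition efficiency itself, which makes the difference between the RF and PLS models easier to interpret.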