The nucleophilic index (NNu) is a key parameter in the screening of amine catalysts. Amines are numerous and structurally diverse, yet only a limited number exhibit an NNu value exceeding 4.0 eV, which renders them potential nucleophiles in chemical reactions. To address this issue, we propose a computational method that quickly identifies amines with high NNu values by combining machine learning (ML) with high-throughput density functional theory (DFT) calculations. We began by training ML models, exploring molecular fingerprint methods, and developing quantitative structure-activity relationship (QSAR) models for well-known amines, based on NNu values derived from DFT calculations. Using Shapley Additive Explanations (SHAP) plots, we identified the five critical substructures that significantly affect the NNu values of amines. These findings were applied to generate 4920 novel hypothetical amines with potentially high NNu values. The QSAR models were then used to predict the NNu values of 259 well-known and 4920 hypothetical amines, which identified five novel hypothetical amines with exceptional NNu values (>4.55 eV). The enhanced NNu values of these novel amines were validated by DFT calculations. One novel hypothetical amine, H1, exhibits an NNu value of 5.36 eV, surpassing the maximum value (5.35 eV) observed among well-established amines. Our strategy accelerates the discovery of highly nucleophilic amines by combining ML predictions with DFT calculations.
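The final screening step described above (rank the predicted NNu values, keep those above the 4.55 eV cutoff, and pass the survivors to DFT for validation) can be illustrated with a minimal sketch. The function name `shortlist`, the amine identifiers, and the numeric predictions below are hypothetical placeholders chosen for illustration; they are not taken from the paper, and the real workflow would obtain predictions from the trained QSAR models.

```python
from typing import Dict, List, Tuple

# Thresholds quoted in the abstract (eV).
NNU_NUCLEOPHILE_EV = 4.0   # above this, an amine is a potential nucleophile
NNU_EXCEPTIONAL_EV = 4.55  # cutoff used for "exceptional" candidates


def shortlist(predicted: Dict[str, float],
              cutoff: float = NNU_EXCEPTIONAL_EV) -> List[Tuple[str, float]]:
    """Return (amine_id, predicted NNu) pairs strictly above cutoff, highest first."""
    hits = [(name, value) for name, value in predicted.items() if value > cutoff]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


# Placeholder predictions (illustrative values only, NOT from the paper).
example_predictions = {
    "amine_a": 3.10,
    "amine_b": 4.20,
    "amine_c": 4.70,
    "amine_d": 5.00,
}

# Candidates that would be forwarded to DFT for validation.
for name, value in shortlist(example_predictions):
    print(f"{name}: predicted NNu = {value:.2f} eV")
```

Using a strict inequality mirrors the ">4.55 eV" criterion in the text; passing `NNU_NUCLEOPHILE_EV` as the cutoff instead would give the broader set of potential nucleophiles.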