Abstract
As machine-learning (ML) methods for organic synthesis have developed, the value of "failed experiments" has been increasingly acknowledged. Accordingly, we have built an exhaustive database of 300 experimental entries obtained from ruthenium-catalyzed hydrogenation reactions covering all combinations of 10 ketone substrates and 30 phosphine ligands. After evaluating the predictive performance of ML models trained on this database, we conducted a virtual screening of commercially available phosphine ligands. For the virtual screening, we used several models, including histogram-based gradient boosting with Mordred descriptors and Ridge regression with MACCS keys. This approach identified high-performance phosphine ligands, and the rationale behind the virtual-screening predictions was analyzed using SHapley Additive exPlanations (SHAP).