Inferring appropriate synthesis reaction routes (i.e., retrosynthesis) for newly designed molecules is vital. Recently, computational methods have produced promising single-step retrosynthesis predictions. However, template-based methods are limited to known synthesis templates, template-free methods are weakly interpretable, and semi-template-based methods make poor use of the associations between chemical entities. To address these issues, this paper leverages the intra-associations between synthons, the inter-associations between synthons and leaving groups (LGs), and the intra-associations between LGs. It develops a multitask graph representation learning model for single-step retrosynthesis prediction (Retro-MTGR) that solves reaction centre deduction and LG identification simultaneously. A comparison with 16 state-of-the-art methods first demonstrates the superiority of Retro-MTGR. Its robustness and scalability, and the contributions of its crucial components, are then validated. More importantly, Retro-MTGR can determine whether a bond can be a reaction centre and which LGs are appropriate for a given synthon. The answers reflect underlying chemical synthesis rules, especially the opposite electrical properties of chemical entities (e.g., reaction sites, synthons, and LGs). Finally, case studies demonstrate that the retrosynthesis routes inferred by Retro-MTGR are promising for single-step synthesis reactions. The code and data of this study are freely available at https://doi.org/10.5281/zenodo.14346324.