Abstract
Molecular optimization, which transforms a given input molecule X into another molecule Y with desired properties, is essential in molecular drug discovery. Traditional approaches either suffer from sample-inefficient learning or ignore information that can be captured through supervised learning on optimized molecule pairs. In this study, we present a novel molecular optimization paradigm, Graph Polish. In this paradigm, with the guidance of source and target molecule pairs with the desired properties, a heuristic optimization solution can be derived: given an input molecule, we first predict which atom can be viewed as the optimization center, and then the nearby regions are optimized around this center. We then propose an effective and efficient learning framework, Teacher and Student polish, to capture the dependencies in the optimization steps. A teacher component automatically identifies and annotates the optimization centers and the preservation, removal, and addition of parts of the molecules; a student component learns this knowledge and applies it to a new molecule. The proposed paradigm offers an intuitive interpretation of the molecular optimization result. Experiments with multiple optimization tasks are conducted on several benchmark datasets. The proposed approach achieves a significant advantage over six state-of-the-art baseline methods. In addition, extensive studies are conducted to validate the effectiveness, explainability, and time savings of the novel optimization paradigm.
Highlights
Introducing a new drug to the market costs over one billion USD and takes an average of 13 years [1], [2]
With the guidance of the source and target molecule pairs of the desired properties, we first predict which atom can be viewed as the optimization center, and the nearby regions are optimized around this center
We compare our approach with six state-of-the-art baselines: the variational junction tree encoder–decoder (VJTNN), GVJTNN, and the copy&refine strategy (CORE), which, like the proposed method, require supervised molecular pairs; and Molecule Deep Q-Networks (MolDQN), GCPN, and the junction tree variational autoencoder (JTVAE), which need no such supervision
Summary
Introducing a new drug to the market costs over one billion USD and takes an average of 13 years [1], [2]. As shown, by differentiating between the source and target molecules, we can derive a heuristic optimization solution: the appropriate substructures (the blue area) are first identified and preserved, and the surrounding context (the yellow area) is transformed. In this way, we can leverage the information in the source molecules to reduce the number of generation steps toward the target molecules and to guide the subsequent generation steps as prior knowledge. Inspired by these observations, in this study we present a novel molecular optimization paradigm, Graph Polish. In this paradigm, with the guidance of source and target molecule pairs with the desired properties, we first predict which atom can be viewed as the optimization center, and the nearby regions are optimized around this center. These optimization steps naturally offer a reference for researchers to understand the process of molecular optimization.
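To make the idea of differentiating a source and target molecule concrete, the following is a minimal sketch, not the paper's teacher component. It assumes a toy representation in which each molecule is a dictionary of atom ids to element symbols plus a set of bonds, and in which the source and target already share atom ids for their common atoms (a real pipeline would need an atom-mapping step, which is not shown here). The function name `annotate_pair` and the rule for choosing candidate centers (a preserved atom bonded to a changed atom) are hypothetical choices made for illustration.

```python
from typing import Dict, List, Set, Tuple

Atoms = Dict[int, str]          # atom id -> element symbol (toy representation)
Bonds = Set[Tuple[int, int]]    # undirected bonds as (atom id, atom id)


def annotate_pair(
    src_atoms: Atoms, src_bonds: Bonds, tgt_atoms: Atoms, tgt_bonds: Bonds
) -> Dict[str, List[int]]:
    """Label atoms as preserved, removed, or added, and list candidate centers.

    Assumption: atoms common to both molecules carry the same id in both.
    """
    # Preserved atoms appear in both molecules with the same element.
    preserved = {
        a for a, elem in src_atoms.items() if tgt_atoms.get(a) == elem
    }
    removed = set(src_atoms) - preserved   # only in the source
    added = set(tgt_atoms) - preserved     # only in the target
    changed = removed | added

    # Hypothetical rule: a candidate optimization center is a preserved atom
    # bonded to at least one removed or added atom.
    centers: Set[int] = set()
    for i, j in src_bonds | tgt_bonds:
        if i in preserved and j in changed:
            centers.add(i)
        if j in preserved and i in changed:
            centers.add(j)

    return {
        "preserved": sorted(preserved),
        "removed": sorted(removed),
        "added": sorted(added),
        "centers": sorted(centers),
    }


# Toy pair: a three-carbon chain whose terminal chlorine is swapped for oxygen.
source_atoms = {1: "C", 2: "C", 3: "C", 4: "Cl"}
source_bonds = {(1, 2), (2, 3), (3, 4)}
target_atoms = {1: "C", 2: "C", 3: "C", 5: "O"}
target_bonds = {(1, 2), (2, 3), (3, 5)}

annotation = annotate_pair(source_atoms, source_bonds, target_atoms, target_bonds)
print(annotation)
```

In this toy pair, atoms 1 to 3 are preserved, atom 4 is removed, atom 5 is added, and atom 3 is the only candidate center, since it is the preserved atom adjacent to the change. Labels of this kind are the sort of supervision a student model could be trained on, under the assumptions stated above.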
More From: IEEE Transactions on Neural Networks and Learning Systems