Abstract

Molecular generation, which aims to computationally design molecular structures with specified properties, is a pivotal step in computational chemistry and drug discovery. In contrast to previous models that focused primarily on SMILES strings or molecular graphs, our model places special emphasis on the substructure information of molecules, enabling it to learn richer chemical rules and structural features from molecular fragments and chemical reaction information. To accomplish this, we fragment the molecules and construct heterogeneous graph representations from atom and fragment information. An encoder then maps the heterogeneous graph data into a latent vector space, and an autoregressive generative model serves as the decoder for molecular generation. Additionally, we apply transfer learning using a small set of ligand molecules known to be active against the target protein, so that the model generates molecules that bind better to that protein. Experimental results demonstrate that our model is highly competitive with state-of-the-art models: it generates valid and diverse molecules with favorable physicochemical properties and drug-likeness. Importantly, it produces novel molecules with high docking scores against the target proteins.
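
To make the idea of an atom-and-fragment heterogeneous graph concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not state the fragmentation rule, so the bonds to cut are supplied by hand here; the function name `build_hetero_graph`, the dictionary keys, and the ethyl acetate example are all illustrative assumptions.

```python
from collections import defaultdict


def build_hetero_graph(atoms, bonds, cut_bonds):
    """Toy atom/fragment heterogeneous graph (hypothetical helper).

    atoms:     list of element symbols, indexed by atom id
    bonds:     list of (i, j) atom-index pairs
    cut_bonds: bonds to break when fragmenting; chosen by hand here,
               since the actual fragmentation rule is an assumption
    """
    cut = {frozenset(b) for b in cut_bonds}

    # Adjacency over the bonds that survive fragmentation.
    adj = defaultdict(list)
    for i, j in bonds:
        if frozenset((i, j)) not in cut:
            adj[i].append(j)
            adj[j].append(i)

    # Each connected component of the cut graph becomes one fragment node.
    frag_of = {}
    fragments = []
    for start in range(len(atoms)):
        if start in frag_of:
            continue
        frag_id = len(fragments)
        frag_of[start] = frag_id
        stack, members = [start], []
        while stack:
            a = stack.pop()
            members.append(a)
            for b in adj[a]:
                if b not in frag_of:
                    frag_of[b] = frag_id
                    stack.append(b)
        fragments.append(sorted(members))

    # Two node types (atom, fragment) and three edge types.
    return {
        "atom_nodes": list(atoms),
        "fragment_nodes": fragments,
        "atom_atom": [tuple(b) for b in bonds],
        "atom_in_fragment": [(a, frag_of[a]) for a in range(len(atoms))],
        "fragment_fragment": sorted(
            {tuple(sorted((frag_of[i], frag_of[j]))) for i, j in cut_bonds}
        ),
    }


# Heavy atoms of ethyl acetate: C0-C1(=O2)-O3-C4-C5, cut at the ester C-O bond.
atoms = ["C", "C", "O", "O", "C", "C"]
bonds = [(0, 1), (1, 2), (1, 3), (3, 4), (4, 5)]
graph = build_hetero_graph(atoms, bonds, cut_bonds=[(1, 3)])
```

In this sketch, cutting the single ester bond splits the molecule into an acyl fragment and an ethoxy fragment, and the broken bond is retained as a fragment-to-fragment edge. An encoder of the kind described above would consume node and edge features defined on such a structure; those features are omitted here.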
