AI-driven approaches are widely used in drug discovery, where candidate molecules are generated and tested on a target protein for binding affinity prediction. However, generating new compounds with desirable molecular properties such as Quantitative Estimate of Drug-likeness (QED) and Dopamine Receptor D2 activity (DRD2) while adhering to distinct chemical laws is challenging. To address these challenges, we proposed a graph-based deep learning framework to generate potential therapeutic drugs targeting the SARS-CoV-2 protein. Our proposed framework consists of two modules: a novel reinforcement learning (RL)-based graph generative module with knowledge graph (KG) and a graph early fusion approach (GEFA) for binding affinity prediction. The first module uses a gated graph neural network (GGNN) model under the RL environment for generating novel molecular compounds with desired properties and a custom-made KG for molecule screening. The second module uses GEFA to predict binding affinity scores between the generated compounds and target proteins. Experiments show how fine-tuning the GGNN model under the RL environment enhances the molecules with desired properties to generate valid and unique compounds using different scoring functions. Additionally, KG-based screening reduces the search space of generated candidate molecules by while retaining of promising binding molecules against SARS-CoV-2 protein, i.e., 3C-like protease (3CLpro). We achieved a binding affinity score of 8.185 from the top rank of generated compound. In addition, we compared top-ranked generated compounds to Indinavir on different parameters, including drug-likeness and medicinal chemistry, for qualitative analysis from a drug development perspective. The online version contains supplementary material available at 10.1007/s13721-023-00409-2.
Read full abstract