Abstract

The goal of medicinal chemistry is to improve existing drug molecules or to create new ones for use in medicine. This is frequently accomplished through lead optimization, which entails creating similar but slightly modified versions of existing molecules. Generative models that use various representations of molecules, such as SMILES strings and molecular graphs, have been developed to aid the search for hits in unexplored chemical space. In this study, an autoencoder architecture was trained on SMILES strings from the ChEMBL database and used to generate 157 analogues of Vandetanib by introducing noise into its latent representation. The distribution of the autoencoder's latent space was controlled by varying the batch size during SMILES reconstruction. Virtual screening and molecular dynamics simulations were conducted, and at least two analogues showed a higher binding affinity than the control compound, demonstrating the potential of this approach for lead optimization. The architecture has a small number of parameters and has the potential to generate a wide variety of molecules. The model is implemented in a Google Colaboratory notebook that the scientific community can explore at https://colab.research.google.com/drive/1BPhw7_-_VV11dbk6s9JGE0bSX0K_-qIh?usp=sharing
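
The core idea of generating analogues by perturbing a latent representation can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `perturb_latent`, the Gaussian noise model, the noise scale `sigma`, and the four-dimensional vector `z_seed` are all assumptions made for illustration, and the trained encoder and decoder are deliberately left out.

```python
import random


def perturb_latent(z, n_samples, sigma=0.1, seed=0):
    """Return n_samples noisy copies of the latent vector z.

    Assumption: independent Gaussian noise with standard deviation sigma
    is added to every latent dimension. The abstract does not specify
    the noise model, so this choice is illustrative only.
    """
    rng = random.Random(seed)  # seeded for reproducible sampling
    return [[x + rng.gauss(0.0, sigma) for x in z] for _ in range(n_samples)]


# Hypothetical latent vector standing in for the encoder output
# of the seed molecule (for example, Vandetanib).
z_seed = [0.5, -1.2, 0.3, 0.0]

# Five perturbed latent points around the seed.
candidates = perturb_latent(z_seed, n_samples=5, sigma=0.1)

# In a full pipeline, each candidate would be passed to the trained decoder
# to obtain a SMILES string, which would then be checked for validity and
# deduplicated before virtual screening.
```

A larger `sigma` would sample points further from the seed in latent space, which would be expected to trade structural similarity for diversity among the decoded molecules.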

