Abstract

A large proportion of lead compounds are derived from natural products. However, most natural products have not been fully tested for their targets. To help resolve this problem, a model using transfer learning was built to predict targets for natural products. The model was pre-trained on a processed ChEMBL dataset and then fine-tuned on a natural product dataset. Benefitting from transfer learning and the data balancing technique, the model achieved an area under the receiver operating characteristic curve (AUROC) of 0.910 despite limited task-related training samples. Embedding space analysis shows that the embedding distribution difference is reduced, which supports the reliability of the model’s outputs for natural products. Case studies on drug datasets further support the model’s performance: the fine-tuned model successfully output all the targets of 62 drugs. Compared with a previous study, our model achieved better results both in AUROC validation and in its success rate for ranking active targets among the top predictions. The target prediction model using transfer learning can be applied in natural product-based drug discovery and has the potential to find more lead compounds or to assist researchers in drug repurposing.
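For readers unfamiliar with the AUROC metric reported above, the following is a minimal sketch of how it can be computed for a single target. It is not the authors' evaluation code: the function name `auroc` and the toy labels and scores are hypothetical, and the pairwise form used here is only one of several equivalent ways to compute the metric.

```python
def auroc(labels, scores):
    """AUROC as the probability that a randomly chosen active compound
    is scored higher than a randomly chosen inactive one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one active and one inactive label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # active ranked above inactive
            elif p == n:
                wins += 0.5   # tie contributes half a win
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = active against the target, 0 = inactive.
labels = [1, 0, 1, 0, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
print(auroc(labels, scores))
```

A value of 0.5 corresponds to random ranking and 1.0 to a perfect separation of actives from inactives, which is why a score near 0.9 is read as strong ranking performance.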

Highlights

  • Natural products have long been an important source of drug discoveries

  • Medicinal chemists have benefited from the identification of natural products [5], but the lack of bioactivity data on natural products remains an obstacle in drug discovery and drug design

  • The hyperparameter optimization step of a deep learning model is crucial to its performance

Summary

Introduction

Natural products have long been an important source of drug discoveries. Among all the drugs approved since 1981, more than 60% have been related to natural products. These can include drugs that have natural product structures or leads derived from natural product scaffolds [1]. Natural products differ clearly from the molecules synthesized by chemists: higher molecular weights and larger scaffolds are more common in natural products, and the scaffolds of natural structures are products of evolution [2]. There is often a greater probability of finding molecules that inhibit a series of expected targets among natural products [3,4]. Medicinal chemists have benefited from the identification of natural products [5], but the lack of bioactivity data on natural products remains an obstacle in drug discovery and drug design.
