Abstract

Background: Malaria is a major healthcare problem worldwide, resulting in an estimated 0.65 million deaths every year. It is caused by members of the parasite genus Plasmodium. The current therapeutic options for malaria are limited to a few classes of molecules and are shrinking fast because of the emergence of widespread drug resistance in the pathogen. The recent availability of high-throughput phenotypic screen datasets for antimalarial activity offers the possibility of creating computational models for bioactivity based on chemical descriptors of molecules, with the potential to accelerate drug discovery for malaria.

Results: In the present study, we used high-throughput screen datasets for the discovery of apicoplast inhibitors of the malarial pathogen, as assayed from the delayed death response. We employed a machine learning approach and developed computational models to predict the biological activity of new antimalarial compounds. The molecules were further evaluated for common substructures using a Maximum Common Substructure (MCS) based approach.

Conclusions: We created computational models using state-of-the-art machine learning algorithms. The models were evaluated on multiple statistical criteria. We found that the Random Forest based approach provides better accuracy, as assessed by ROC curve analysis. We further evaluated the active molecules using a substructure based approach to identify common substructures enriched in the active set. We argue that the computational models generated could be used effectively to screen large molecular datasets and prioritize them for phenotypic screens, drastically reducing cost while improving the hit rate.

Highlights

  • Misclassification cost was set for False Negatives and was incremented so as to stay around the upper limit of False Positives (i.e., 20%)

  • As expected, introducing a misclassification cost for each of the classifiers increased the number of True Positives and decreased the number of False Negatives, thereby increasing the robustness of the models
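
The cost-tuning idea in the highlights above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a classifier that already outputs a probability of activity for each molecule, and it applies the cost through the decision threshold (predict active when p > 1 / (1 + cost), the minimum-expected-cost rule when a False Positive costs 1) rather than by retraining. The function names, the integer cost steps, the toy data, and the reading of the 20% limit as FP / (FP + TN) are all assumptions made for illustration.

```python
from typing import List, Optional, Tuple


def confusion(scores: List[float], labels: List[int],
              threshold: float) -> Tuple[int, int, int, int]:
    """Return (TP, FP, TN, FN) when 'active' is predicted for score > threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted_active = score > threshold
        if predicted_active and label == 1:
            tp += 1
        elif predicted_active and label == 0:
            fp += 1
        elif not predicted_active and label == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def tune_fn_cost(scores: List[float], labels: List[int],
                 fp_limit: float = 0.20,
                 max_cost: int = 100) -> Optional[Tuple[int, int, int, int, int]]:
    """Increase the False Negative cost until the FP rate would exceed fp_limit.

    Returns (cost, TP, FP, TN, FN) for the largest cost that stays within the
    limit, or None if even cost = 1 exceeds it.
    """
    best = None
    for cost in range(1, max_cost + 1):
        # With FP cost 1 and FN cost `cost`, predicting 'active' is cheaper
        # in expectation whenever p * cost > (1 - p), i.e. p > 1 / (1 + cost).
        threshold = 1.0 / (1.0 + cost)
        tp, fp, tn, fn = confusion(scores, labels, threshold)
        fp_rate = fp / (fp + tn) if (fp + tn) else 0.0
        if fp_rate > fp_limit:
            break
        best = (cost, tp, fp, tn, fn)
    return best


# Toy data (hypothetical): 10 inactive molecules followed by 4 active ones.
DEMO_SCORES = [0.05, 0.08, 0.10, 0.12, 0.15, 0.18, 0.22, 0.30, 0.40, 0.45,
               0.90, 0.60, 0.35, 0.28]
DEMO_LABELS = [0] * 10 + [1] * 4

if __name__ == "__main__":
    print(tune_fn_cost(DEMO_SCORES, DEMO_LABELS))
```

On the toy data, raising the cost from 1 to 2 moves True Positives from 2 to 3 and False Negatives from 2 to 1 while the False Positive rate reaches exactly 20%; a cost of 3 would push it to 30%, so the search stops at 2.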

Summary

Introduction

Malaria is a major healthcare problem worldwide, resulting in an estimated 0.65 million deaths every year. It is caused by members of the parasite genus Plasmodium. The recent availability of high-throughput phenotypic screen datasets for antimalarial activity offers the possibility of creating computational models for bioactivity based on chemical descriptors of molecules, with the potential to accelerate drug discovery for malaria. The availability of chemical structure and bioactivity information in standardized forms provides immense opportunities to understand the correlation between the chemical properties of molecules and their activities, and opens up the possibility of creating predictive computational models for bioactivity [17,18]. These predictive models make it possible to computationally screen large molecular datasets, offering a way to improve the hit rate and thereby reduce the overall cost of drug discovery. We have previously generated such predictive models for anti-tubercular molecules [19,20] and for small-molecule modulators of miRNA [21].
