Abstract

Historically, drug discovery took many years to identify and develop a drug: on average, a drug takes about 12 years to travel from the research lab to the patient. The introduction of machine learning has made the process considerably simpler, and the use of computational tools in the early stages of drug development has expanded in recent decades. One computational procedure used in drug discovery is Virtual Screening (VS), which identifies compounds that can bind to a drug target. A preliminary step, before analyzing the binding of a ligand to its protein target, is predicting the drug likeness of the compounds. The main objective of this study is to predict the drug likeness of compounds from molecular descriptor information using tree-based ensembles. Several classification algorithms are analyzed, and their accuracy in predicting drug likeness is calculated. The study shows that Rotation Forest outperforms the other classification algorithms in predicting the drug likeness of chemical compounds. The measured accuracies of Rotation Forest, Random Forest, Support Vector Machines, KNN, Decision Tree and Naïve Bayes are 98%, 97%, 94.8%, 92.8%, 91.4% and 89.5%, respectively.

Highlights

  • Drug discovery is the process of identifying a potential drug for a specified disease

  • We extended the study to more compounds (nearly 600) and to kernel-based methods (SVM) and nearest-neighbor methods

  • Our paper focuses on predicting drug likeness with machine learning methods


Summary

Introduction

Drug discovery is the process of identifying a potential drug for a specified disease, and the whole process takes many years. The Random Forest algorithm [2] is an ensemble classifier based on the concept of bagging. It is also used for feature engineering, that is, for identifying the most important features among those available in the training dataset. The current work is based on the study of molecular descriptors for drug and non-drug compounds extracted from medicinal plants. This study compares different classification algorithms for predicting the drug likeness of chemical compounds. The use of an SVM classifier to separate drug from non-drug compounds is detailed in [1]. The authors used SVM with various feature selection approaches. SVM with subset selection outperformed logistic regression, which was used in their previous study. The main drawback they report is a comparatively limited sample set. Given this limitation, larger datasets are needed to confirm the conclusion in future work.
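
As a rough illustration of bagging, and of ranking features by how often the ensemble relies on them, the following minimal Python sketch trains decision stumps on bootstrap samples and combines them by majority vote. It is not the study's implementation: Random Forest grows full trees on random feature subsets, whereas this sketch uses one-split stumps. The two descriptors, the cutoff of 500, the synthetic data and all function names are assumptions made only for illustration.

```python
import random
from collections import Counter


def fit_stump(X, y):
    """Fit a one-split classifier: pick the (feature, threshold) with fewest training errors."""
    best = None
    for f in range(len(X[0])):
        for t in sorted({row[f] for row in X}):
            left = [lab for row, lab in zip(X, y) if row[f] <= t]
            right = [lab for row, lab in zip(X, y) if row[f] > t]
            left_label = Counter(left).most_common(1)[0][0]
            # If nothing falls to the right, reuse the left label.
            right_label = Counter(right).most_common(1)[0][0] if right else left_label
            errors = sum(lab != left_label for lab in left)
            errors += sum(lab != right_label for lab in right)
            if best is None or errors < best[0]:
                best = (errors, f, t, left_label, right_label)
    return best[1:]  # (feature, threshold, left_label, right_label)


def predict_stump(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


def fit_bagging(X, y, n_estimators=25, seed=0):
    """Bagging: train each stump on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_estimators):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps


def predict_bagging(stumps, row):
    """Combine the stumps by majority vote."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]


def feature_usage(stumps, n_features):
    """Fraction of stumps that split on each feature (a crude importance score)."""
    counts = Counter(s[0] for s in stumps)
    return [counts[f] / len(stumps) for f in range(n_features)]


# Synthetic data (hypothetical): descriptor 0 decides the label, descriptor 1 is noise.
data_rng = random.Random(42)
X = [[float(v), data_rng.random()] for v in range(200, 801, 25)]
y = [1 if row[0] <= 500 else 0 for row in X]  # 1 = "drug-like" under the toy rule

stumps = fit_bagging(X, y)
usage = feature_usage(stumps, n_features=2)
accuracy = sum(predict_bagging(stumps, r) == lab for r, lab in zip(X, y)) / len(X)
print("training accuracy:", accuracy)
print("feature usage:", usage)
```

In this sketch, the usage score is a stand-in for the feature ranking that Random Forest provides; a real study would compute importance from the trained forest and report accuracy on held-out compounds rather than on the training set.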

