Abstract
Measuring toxicity is one of the main steps in drug development. Hence, there is a high demand for computational models that predict the toxic effects of potential drugs. In this study, we used a dataset that covers four toxicity effects: mutagenic, tumorigenic, irritant and reproductive. The proposed model consists of three phases. In the first phase, rough set-based methods are used to select the most discriminative features, which reduces classification time and improves classification performance. In the second phase, because the class distribution is imbalanced, different sampling methods (Random Under-Sampling, Random Over-Sampling and the Synthetic Minority Oversampling Technique) are used to rebalance the datasets. An ITerative Sampling (ITS) method is proposed to avoid the limitations of those methods. The ITS method has two steps. The first step (the sampling step) iteratively modifies the prior distribution of the minority and majority classes. In the second step, a data cleaning method is used to remove the class overlap produced by the first step. In the third phase, a Bagging classifier is used to classify an unknown drug as toxic or non-toxic. The experimental results showed that the proposed model performed well in classifying unknown samples for all four toxic effects in the imbalanced datasets.
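To make the second phase more concrete, the two simplest sampling methods named above, Random Over-Sampling and Random Under-Sampling, can be sketched in plain Python. This is only a minimal illustration under stated assumptions, not the authors' implementation and not the ITS method, whose details are not given here. The function names, the `seed` parameter and the toy data are hypothetical.

```python
import random
from collections import Counter


def _group_by_class(X, y):
    """Group feature rows by their class label."""
    by_class = {}
    for row, label in zip(X, y):
        by_class.setdefault(label, []).append(row)
    return by_class


def random_over_sample(X, y, seed=0):
    """Duplicate randomly chosen rows of the smaller classes until
    every class is as large as the largest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = max(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Draw the missing rows with replacement from the same class.
        extra = [rng.choice(rows) for _ in range(target - len(rows))]
        for row in rows + extra:
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


def random_under_sample(X, y, seed=0):
    """Keep a random subset of the larger classes so that every class
    is as small as the smallest one."""
    rng = random.Random(seed)
    by_class = _group_by_class(X, y)
    target = min(len(rows) for rows in by_class.values())
    X_out, y_out = [], []
    for label, rows in by_class.items():
        # Sample without replacement, so no row is kept twice.
        for row in rng.sample(rows, target):
            X_out.append(row)
            y_out.append(label)
    return X_out, y_out


# Toy imbalanced data (hypothetical): 8 non-toxic (0) and 2 toxic (1) samples.
X = [[i] for i in range(10)]
y = [0] * 8 + [1] * 2

X_over, y_over = random_over_sample(X, y)
X_under, y_under = random_under_sample(X, y)
print(Counter(y_over))   # both classes now have 8 samples
print(Counter(y_under))  # both classes now have 2 samples
```

Over-sampling keeps all of the original data but repeats minority rows, while under-sampling discards majority rows. These are the kinds of limitations that the abstract says ITS is intended to avoid.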
Highlights
There is a high demand for computational models to predict the toxicity effects of the potential drugs
Measuring toxicity is one of the main steps in drug development
The experimental results showed that the proposed model performed well in classifying the unknown samples according to all toxic effects in the imbalanced datasets
Summary
There is a high demand for computational models to predict the toxicity effects of potential drugs. Measuring the toxicity of a drug's components is one of the steps in drug development. This step is very important because it is used to predict drug failures before any clinical trials. It could save $100 million per drug developed in the US, as reported by the Food and Drug Administration (FDA) [2,3]. This reflects the importance of determining toxicological effects as early as possible.
Molecular descriptors: Molecular Weight; Absolute Weight; cLogP (octanol/water partition coefficient); cLogS (aqueous solubility); H-Acceptors (hydrogen bond acceptors); H-Donors (hydrogen bond donors); Total Surface Area; Polar Surface Area.