Abstract

Drug-induced liver toxicity is a significant safety challenge for both patient health and the pharmaceutical industry. It causes the termination of drug candidates in clinical trials and the withdrawal of approved drugs from the market. Thus, it is essential to identify hepatotoxic compounds in the initial stages of the drug development process. The purpose of this study is to construct quantitative structure-activity relationship models using machine learning algorithms and systematic feature selection methods for molecular descriptor sets. The models were built from a large and diverse set of 1253 drug compounds and were validated internally with 10-fold cross-validation. We applied a variety of feature selection techniques to extract the optimal subset of descriptors as modeling features and thereby improve prediction performance. Experimental results suggested that the support vector machine-based classifier achieved better classification accuracy with a reduced set of molecular descriptors. The final optimal model provides an accuracy of 0.811, a sensitivity of 0.840, a specificity of 0.783 and a Matthews correlation coefficient of 0.623 on the internal validation set. Furthermore, this model outperformed prior studies when evaluated on both the internal and external test sets. Using distinct optimal molecular descriptors as modeling features produces an in silico model with superior performance.

Highlights

  • The liver is an indispensable organ of the body due to its crucial role in metabolizing xenobiotics [1]

  • Among the several methods that have been applied in Quantitative Structure Activity Relationship (QSAR) modeling, we focused mainly on the following machine learning algorithms to develop binary classification models: Support Vector Machine (SVM), Multi-Layer Perceptron (MLP), Logistic Regression (LR), Random Forest (RF), XG Boosting (XGB), K-Nearest Neighbors (KNN), Naive Bayes (NB) and Decision Tree (DT) classifier [27,32,33,53,55,56]

  • We compared the performance of the SVM classifier built on the final selected optimal descriptor subset with that of other conventional supervised learning methods such as Multi-Layer Perceptron (MLP), Logistic Regression (LR), Random Forest (RF), XG Boosting (XGB), K-Nearest Neighbors (KNN), Naive Bayes (NB) and Decision Tree (DT) classifiers
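The comparison described in the highlights above can be sketched with scikit-learn as follows. This is an illustration under stated assumptions, not the authors' pipeline: the data are synthetic (`make_classification`), the univariate `SelectKBest` filter stands in for the study's feature selection methods, the hyperparameters are library defaults, `compare_classifiers` and `k_features` are hypothetical names, and XGB is omitted because it lives in a separate package.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a descriptor matrix (rows: compounds, columns: descriptors).
X, y = make_classification(
    n_samples=300, n_features=50, n_informative=10, random_state=0
)

# Default hyperparameters; an assumption, not the settings used in the study.
classifiers = {
    "SVM": SVC(),
    "MLP": MLPClassifier(max_iter=1000, random_state=0),
    "LR": LogisticRegression(max_iter=1000),
    "RF": RandomForestClassifier(random_state=0),
    "KNN": KNeighborsClassifier(),
    "NB": GaussianNB(),
    "DT": DecisionTreeClassifier(random_state=0),
}


def compare_classifiers(X, y, classifiers, k_features=20, n_splits=10):
    """Return the mean 10-fold cross-validated accuracy for each classifier."""
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=0)
    scores = {}
    for name, clf in classifiers.items():
        # Scaling and feature selection sit inside the pipeline so they are
        # refit on each training fold and never see the held-out fold.
        pipe = make_pipeline(
            StandardScaler(), SelectKBest(f_classif, k=k_features), clf
        )
        fold_scores = cross_val_score(pipe, X, y, cv=cv, scoring="accuracy")
        scores[name] = fold_scores.mean()
    return scores


scores = compare_classifiers(X, y, classifiers)
for name, acc in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name}: {acc:.3f}")
```

Keeping feature selection inside the cross-validated pipeline is a deliberate choice: selecting descriptors on the full data set before splitting would leak information from the validation folds and inflate the reported accuracy.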

Introduction

The liver is an indispensable organ of the body due to its crucial role in metabolizing xenobiotics [1]. Drug-induced liver toxicity is one of the primary reasons for drug failure in clinical cases and leads to the withdrawal of approved drugs from the market. Drugs, herbals and other dietary products are responsible for unpredictable adverse liver injury [2,3,4,5]. The idiosyncratic behavior of drugs is caused by the prescribed dose level and depends on the patient’s metabolic, genetic and immunological factors [6]. Because of these unpredictable adverse hepatic effects on patients’ health, drug-induced liver injury (DILI) risk assessment has become one of the most important concerns for safe drug development [7,8,9,10]. More effort is therefore required to identify potential hepatotoxic compounds in advance.
