Abstract

Motivation: Ether-a-go-go-related gene (hERG) channel blockade by small molecules is a major concern during drug development in the pharmaceutical industry. Blockade of hERG channels may prolong the QT interval, which can potentially lead to cardiotoxicity. Various in-silico techniques, including deep learning models, are widely used to screen out small molecules with potential hERG-related toxicity. Most published deep learning methods use a single type of feature, which might restrict their performance. Methods based on more than one type of feature, such as DeepHIT, struggle with the aggregation of the extracted information. DeepHIT performs better on one or two metrics, such as negative predictive value (NPV) and sensitivity (SEN), but struggles on others, such as Matthews correlation coefficient (MCC), accuracy (ACC), positive predictive value (PPV) and specificity (SPE). There is therefore a need for a method that can efficiently aggregate information gathered from models based on different chemical representations and boost hERG toxicity prediction over a range of performance metrics.
Results: In this paper, we propose a deep learning framework based on step-wise training to predict the hERG channel blocking activity of small molecules. Our approach uses five individual deep learning base models, each with its own base features, and a separate neural network that combines the outputs of the five base models. On three external independent test sets, with activity defined by an IC50 threshold of 10 μM, our method achieves better performance for a combination of classification metrics. We also investigate the effective aggregation of the extracted chemical information for robust hERG activity prediction.
In summary, CardioTox net can serve as a robust tool for screening small molecules for hERG channel blockade in drug discovery pipelines and performs better than previously reported methods on a range of classification metrics.
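The six metrics named in the abstract all derive from the four cells of a binary confusion matrix. The sketch below is illustrative only and is not taken from the paper: the function names are hypothetical, the example IC50 values and predictions are invented, and treating IC50 strictly below 10 μM as the blocker class is an assumption about how the threshold is applied.

```python
import math


def label_blockers(ic50_um, threshold_um=10.0):
    """Label a molecule 1 (blocker) if its IC50 in uM is below the threshold.

    Assumption: the boundary is exclusive; the source only states the 10 uM threshold.
    """
    return [1 if value < threshold_um else 0 for value in ic50_um]


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def safe_div(numerator, denominator):
    """Return 0.0 instead of raising when a metric is undefined."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(y_true, y_pred):
    """Compute ACC, SEN, SPE, PPV, NPV and MCC from binary labels."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "ACC": safe_div(tp + tn, tp + tn + fp + fn),
        "SEN": safe_div(tp, tp + fn),   # true positive rate
        "SPE": safe_div(tn, tn + fp),   # true negative rate
        "PPV": safe_div(tp, tp + fp),   # precision
        "NPV": safe_div(tn, tn + fn),
        "MCC": safe_div(tp * tn - fp * fn, mcc_denominator),
    }


# Invented example values, for illustration only.
example_labels = label_blockers([0.5, 3.0, 9.9, 25.0, 100.0, 12.0])
example_metrics = classification_metrics(example_labels, [1, 1, 0, 0, 0, 1])
print(example_metrics)
```

Because SEN and NPV depend only on how blockers are handled while SPE and PPV depend on non-blockers, a model can score well on one pair and poorly on the other. MCC uses all four cells, which is why it is a common single summary for imbalanced toxicity data.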

Highlights

  • The human ether-à-go-go-related gene encodes a voltage-dependent ion channel (Kv11.1, hERG) involved in controlling the electrical activity of the heart by mediating the re-polarisation current in the cardiac action potential. [...] for QT interval prolongation by non-cardiovascular medicinal products were decided at the International Conference on Harmonization of Technical Requirements for the Registration of Pharmaceuticals for Human Use (ICH) [4, 5]. (Karim et al., J Cheminform (2021) 13:60)

  • Computational methods to predict hERG liability have been established and can help prioritise molecules during the early phase of drug development [4]. Most of these methods are based either on machine learning techniques, including random forest (RF), support vector machine (SVM), deep neural networks (DNN) and graph convolutional neural networks (GCN), or on structure-based methods, including pharmacophore searching, quantitative structure-activity relationships (QSAR) and molecular docking [6,7,8,9,10].

  • We hypothesize that extracting chemical information from all, or subsets, of the three levels of features and their variants can improve performance over a wide range of accuracy metrics for molecular hERG activity prediction. For this purpose, we propose a deep learning framework based on step-wise training, called CardioTox net, that improves upon the previously published best-in-class results on most of the performance metrics.
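The step-wise idea in the last highlight, where base models are trained on their own features first and a separate model then learns to combine their outputs, can be sketched as a two-stage stacking procedure. Everything in the sketch below is an assumption made for illustration: it uses two base models rather than five, plain logistic regression in place of deep networks, invented toy data, and hypothetical names such as `train_stacked`. It is not the CardioTox net implementation.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(X, y, lr=0.5, epochs=2000):
    """Fit logistic-regression weights and bias by batch gradient descent."""
    n_samples, n_features = len(X), len(X[0])
    weights, bias = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            error = sigmoid(sum(w * x for w, x in zip(weights, xi)) + bias) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += error * xj
            grad_b += error
        weights = [w - lr * g / n_samples for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n_samples
    return weights, bias


def predict_proba(model, x):
    weights, bias = model
    return sigmoid(sum(w * xj for w, xj in zip(weights, x)) + bias)


def train_stacked(views, y):
    """Step 1: fit one base model per feature view.
    Step 2: keep the base models fixed and fit a meta-model on their outputs."""
    base_models = {name: train_logistic(X, y) for name, X in views.items()}
    meta_inputs = [
        [predict_proba(base_models[name], views[name][i]) for name in base_models]
        for i in range(len(y))
    ]
    meta_model = train_logistic(meta_inputs, y)
    return base_models, meta_model


def predict_stacked(base_models, meta_model, sample):
    """Combine the base-model probabilities for one molecule."""
    meta_input = [predict_proba(m, sample[name]) for name, m in base_models.items()]
    return predict_proba(meta_model, meta_input)


# Invented toy data: two feature views of the same six molecules.
TOY_VIEWS = {
    "descriptors": [[0.1], [0.2], [0.8], [0.9], [0.3], [0.7]],
    "fingerprints": [[1, 0], [1, 0], [0, 1], [0, 1], [1, 1], [0, 1]],
}
TOY_LABELS = [0, 0, 1, 1, 0, 1]

base_models, meta_model = train_stacked(TOY_VIEWS, TOY_LABELS)
first_sample = {name: X[0] for name, X in TOY_VIEWS.items()}
print(predict_stacked(base_models, meta_model, first_sample))
```

For brevity the meta-model here is fitted on the same molecules used to train the base models. In practice a stacking setup would usually fit it on held-out predictions, so that it does not learn from overconfident base outputs.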

Summary

Introduction

The human ether-à-go-go-related gene (hERG) encodes a voltage-dependent ion channel (Kv11.1, hERG) involved in controlling the electrical activity of the heart by mediating the re-polarisation current in the cardiac action potential. [...] for QT interval prolongation by non-cardiovascular medicinal products were decided at the International Conference on Harmonization of Technical Requirements for the Registration of Pharmaceuticals for Human Use (ICH) [4, 5]. Computational methods to predict hERG liability have been established and can help prioritise molecules during the early phase of drug development [4]. Most of these methods are based either on machine learning techniques, including random forest (RF), support vector machine (SVM), deep neural networks (DNN) and graph convolutional neural networks (GCN), or on structure-based methods, including pharmacophore searching, quantitative structure-activity relationships (QSAR) and molecular docking [6,7,8,9,10]. Molecular graph representations have been used with graph convolutional neural networks [16]. This intermediate-level molecular graph representation offers a compromise between high-level physicochemical features and low-level SMILES and fingerprints [17]. Each molecule can be represented as a molecular graph, which consists of node features and an adjacency matrix.
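To make the molecular graph representation concrete, the following minimal sketch builds node features and an adjacency matrix for ethanol (heavy atoms only). The one-hot element encoding, the three-element vocabulary and the function names are assumptions chosen for illustration; the paper's actual node features are not described in this excerpt. The second function shows the simplest form of neighbourhood aggregation, not the specific graph convolution used by the authors.

```python
def build_molecular_graph(atoms, bonds, vocabulary):
    """Return (node_features, adjacency) for a molecule.

    atoms: list of element symbols, one per node.
    bonds: list of (i, j) index pairs, one per bond.
    vocabulary: element symbols used for the one-hot encoding (an assumption).
    """
    n_atoms = len(atoms)
    node_features = [
        [1 if atom == symbol else 0 for symbol in vocabulary] for atom in atoms
    ]
    adjacency = [[0] * n_atoms for _ in range(n_atoms)]
    for i, j in bonds:
        # Bonds are undirected, so the matrix is symmetric.
        adjacency[i][j] = 1
        adjacency[j][i] = 1
    return node_features, adjacency


def aggregate_neighbours(node_features, adjacency):
    """Sum each atom's features with those of its bonded neighbours."""
    n_atoms, n_features = len(node_features), len(node_features[0])
    aggregated = []
    for i in range(n_atoms):
        row = list(node_features[i])  # start from the atom's own features
        for j in range(n_atoms):
            if adjacency[i][j]:
                for k in range(n_features):
                    row[k] += node_features[j][k]
        aggregated.append(row)
    return aggregated


# Ethanol, heavy atoms only: C-C-O.
features, adjacency = build_molecular_graph(
    atoms=["C", "C", "O"], bonds=[(0, 1), (1, 2)], vocabulary=["C", "N", "O"]
)
print(features)
print(adjacency)
print(aggregate_neighbours(features, adjacency))
```

After one aggregation step each atom's vector also counts the elements bonded to it, which illustrates how a graph representation captures local connectivity that a flat list of descriptors does not.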
