SimBoost: a read-across approach for predicting drug–target binding affinities using gradient boosting machines

Tong He,Fuqiang Ban,Martin Ester,Marten Heidemeyer,Artem Cherkasov

doi:10.1186/s13321-017-0209-z


Open Access

https://doi.org/10.1186/s13321-017-0209-z


Abstract

Computational prediction of the interaction between drugs and targets is a standing challenge in the field of drug discovery. A number of rather accurate predictions were reported for various binary drug–target benchmark datasets. However, a notable drawback of a binary representation of interaction data is that missing endpoints for non-interacting drug–target pairs are not differentiated from inactive cases, and that predicted levels of activity depend on pre-defined binarization thresholds. In this paper, we present a method called SimBoost that predicts continuous (non-binary) values of binding affinities of compounds and proteins and thus incorporates the whole interaction spectrum from true negative to true positive interactions. Additionally, we propose a version of the method called SimBoostQuant which computes a prediction interval in order to assess the confidence of the predicted affinity, thus defining the Applicability Domain metrics explicitly. We evaluate SimBoost and SimBoostQuant on two established drug–target interaction benchmark datasets and one new dataset that we propose to use as a benchmark for read-across cheminformatics applications. We demonstrate that our methods outperform the previously reported models across the studied datasets.
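The drawback of binary labels described above can be illustrated with a minimal sketch. The affinity values, drug and target names, and thresholds below are hypothetical and are not taken from the paper's datasets; the sketch assumes a score where higher means stronger binding.

```python
# Hypothetical measured affinities (higher = stronger binding).
# Pairs missing from the dict were never measured.
measured = {
    ("d1", "t1"): 8.2,
    ("d1", "t2"): 5.1,
    ("d2", "t1"): 6.9,
}
drugs = ["d1", "d2"]
targets = ["t1", "t2"]


def binarize(measured, drugs, targets, threshold):
    """Return Y[i][j] = 1 if the measured affinity is >= threshold, else 0.

    Unmeasured pairs also become 0, so they cannot be told apart
    from pairs that were measured and found inactive.
    """
    return [
        [1 if measured.get((d, t), float("-inf")) >= threshold else 0
         for t in targets]
        for d in drugs
    ]


y_strict = binarize(measured, drugs, targets, threshold=7.0)
y_loose = binarize(measured, drugs, targets, threshold=6.5)

print(y_strict)  # labels under the stricter threshold
print(y_loose)   # labels under the looser threshold
```

In this toy example the pair (d2, t1) changes label when the threshold moves from 7.0 to 6.5, and the unmeasured pair (d2, t2) receives the same label as the measured weak binder (d1, t2). Predicting the continuous value directly avoids both effects.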

Highlights

Finding a compound that selectively binds to a particular protein is a highly challenging and typically expensive procedure in the drug development process, where more than 90% of candidate compounds fail due to cross-reactivity and/or toxicity issues.
We introduce matrix factorization, both because it has been used in the literature for binary drug–target interaction prediction and because it plays an important role in our proposed method.
We propose a version of SimBoost, called SimBoostQuant, which uses quantile regression to learn a prediction interval for a given drug–target pair as a measure of the confidence of the prediction.
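As a rough illustration of the idea behind quantile regression, the sketch below uses the quantile (pinball) loss, which is the standard objective for estimating a quantile. It is not the authors' implementation: it fits a single constant per quantile level instead of a boosted model, and the affinity values and the quantile levels 0.1 and 0.9 are assumptions chosen for the example.

```python
def pinball_loss(y_true, y_pred, q):
    """Quantile (pinball) loss for a quantile level q in (0, 1)."""
    diff = y_true - y_pred
    # Under-prediction is weighted by q, over-prediction by (1 - q).
    return q * diff if diff >= 0 else (q - 1) * diff


def best_constant(values, q):
    """Pick the observed value that minimizes the total pinball loss."""
    return min(values, key=lambda c: sum(pinball_loss(v, c, q) for v in values))


# Hypothetical affinities observed for similar drug-target pairs.
affinities = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5, 9.0, 9.5, 10.0]

lower = best_constant(affinities, 0.1)  # estimate of the 10% quantile
upper = best_constant(affinities, 0.9)  # estimate of the 90% quantile
interval_width = upper - lower

print(lower, upper, interval_width)
```

A narrow interval between the lower and upper quantile estimates suggests a more confident prediction, and a wide one suggests less confidence. In a boosted model the same loss would be minimized by a function of the pair's features instead of a constant.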

Summary

Introduction

Finding a compound that selectively binds to a particular protein is a highly challenging and typically expensive procedure in the drug development process, where more than 90% of candidate compounds fail due to cross-reactivity and/or toxicity issues. Gaining knowledge about the interaction of compounds and target proteins through computational methods is therefore an important topic in drug research. Such in silico approaches can speed up experimental wet lab work by systematically prioritizing the most potent compounds and helping to predict their potential side effects. The datasets commonly used for the training and evaluation of such machine learning-based prediction methods are the Enzymes, Ion Channels, Nuclear Receptor, and G Protein-Coupled Receptor datasets [3]. These datasets contain binary labels: Y(i,j) = 1 if the drug–target pair (di, tj) is known to interact (as shown by wet lab experiments), and Y(i,j) = 0 if either (di, tj) is known not to interact or the interaction of (di, tj) is unknown. The datasets tend to be biased towards drugs and targets that are considered to be more important or easier



Journal: Journal of Cheminformatics	Publication Date: Apr 18, 2017
Citations: 287	License type: open-access


