Development of New Methods Needs Proper Evaluation-Benchmarking Sets for Machine Learning Experiments for Class A GPCRs.

Damian Leśniak, Andrzej J Bojarski, Igor Sieradzki, Sabina Podlewska, Stanisław Jastrzębski, Jacek Tabor

doi:10.1021/acs.jcim.9b00689

Abstract

New computational approaches for virtual screening applications are constantly being developed. However, before a particular tool is used to search for new active compounds, its effectiveness for that type of task must be examined. In this study, we conducted a detailed analysis of various aspects of preparing data sets for such an evaluation. We propose a protocol for fetching data from the ChEMBL database, examine various compound representations in terms of the possible bias resulting from the way they are generated, and define a new metric for comparing the structural similarity of compounds, which is in line with chemical intuition. The newly developed method is also used to evaluate various approaches for dividing the data set into training and test parts, which are also examined in detail as a possible source of bias in the results. Finally, machine learning methods are applied in cross-validation studies of the data sets constructed within the paper, which constitute benchmarks for the assessment of computational methods developed for virtual screening tasks. Additionally, analogous data sets for class A G protein-coupled receptors (the 100 targets with the highest number of records) were prepared. They are available at http://gmum.net/benchmarks/, together with a script enabling reproduction of all results, available at https://github.com/lesniak43/ananas.

Journal: Journal of Chemical Information and Modeling
Publication Date: Oct 11, 2019
Citations: 7
