Distributed heterogeneous ensemble learning on Apache Spark for ligand-based virtual screening

Karima Sid, Mohamed Batouche

doi:10.1504/ijdmmm.2021.10035119

Abstract

Virtual screening is one of the most common computer-aided drug design techniques; it applies computational tools and methods to large libraries of molecules to identify potential drugs. Ensemble learning is a recent paradigm for improving the predictive performance and robustness of machine learning models, and it has been applied successfully in ligand-based virtual screening (LBVS). Applying ensemble learning to huge molecular libraries, however, is computationally expensive, so distributing and parallelising the task with frameworks such as Apache Spark has become an important step. In this paper, we propose HEnsL_DLBVS, a new approach to heterogeneous ensemble learning distributed on Spark, to improve large-scale LBVS results. To handle the problem of imbalanced big training datasets, we propose a novel hybrid technique. We generate new training datasets to evaluate the approach. Experimental results confirm the effectiveness of our approach, which achieves satisfactory accuracy and outperforms homogeneous models.
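
The abstract does not say how the base models are combined or how the hybrid imbalance technique works. Purely as an illustration of the general idea of a heterogeneous ensemble (several different kinds of model voting on each molecule), a minimal sketch might look like the following. The three toy models, the two-value feature vectors, and the names `majority_vote` and `screen` are all hypothetical and are not taken from the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A base model maps a feature vector to a label: 1 = active, 0 = inactive.
Model = Callable[[Sequence[float]], int]


def threshold_model(x: Sequence[float]) -> int:
    """Toy model 1 (hypothetical): threshold on the first feature."""
    return 1 if x[0] > 0.5 else 0


def sum_model(x: Sequence[float]) -> int:
    """Toy model 2 (hypothetical): threshold on the sum of all features."""
    return 1 if sum(x) > 1.0 else 0


def nearest_prototype_model(x: Sequence[float]) -> int:
    """Toy model 3 (hypothetical): closest of two fixed prototypes."""
    active, inactive = (0.9, 0.8), (0.1, 0.2)

    def sq_dist(p: Sequence[float]) -> float:
        return sum((a - b) ** 2 for a, b in zip(x, p))

    return 1 if sq_dist(active) < sq_dist(inactive) else 0


def majority_vote(models: Sequence[Model], x: Sequence[float]) -> int:
    """Return the label chosen by most base models; ties go to inactive (0)."""
    votes = Counter(m(x) for m in models)
    return 1 if votes[1] > votes[0] else 0


ENSEMBLE: List[Model] = [threshold_model, sum_model, nearest_prototype_model]


def screen(library: Sequence[Sequence[float]]) -> List[int]:
    """Label every molecule in the library with the ensemble's vote."""
    return [majority_vote(ENSEMBLE, x) for x in library]
```

Because each molecule is scored independently, a function like `screen` could in principle be applied to each partition of a distributed library, which is the kind of parallelism the abstract attributes to Spark; the paper's actual distribution scheme is not described here.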

More From: International Journal of Data Mining, Modelling and Management

Similar Papers

Maximal Unbiased Benchmarking Data Sets for Human Chemokine Receptors and Comparative Analysis.
Jie Xia ... Xiang Simon Wang
Journal of Chemical Information and Modeling | VOL. 58
26 Apr 2018

Comparison of Ligand-Based and Receptor-Based Virtual Screening of HIV Entry Inhibitors for the CXCR4 and CCR5 Receptors Using 3D Ligand Shape Matching and Ligand−Receptor Docking
Violeta I Pérez-Nueno ... Obdulia Rabal
Journal of Chemical Information and Modeling | VOL. 48
26 Feb 2008

Discovery of novel mGluR1 antagonists: A multistep virtual screening approach based on an SVM model and a pharmacophore hypothesis significantly increases the hit rate and enrichment factor
Guo-Bo Li ... Lin-Li Li
Bioorganic & Medicinal Chemistry Letters | VOL. 21
25 Jan 2011

Ligand-based Virtual Screening using Fuzzy Correlation Coefficient
Ali Ahmed ... Naomie Salim
International Journal of Computer Applications | VOL. 19
30 Apr 2011