Representability of algebraic topology for biomolecules in machine learning based scoring and virtual screening.

Zixuan Cang, Lin Mu, Guo-Wei Wei

doi:10.1371/journal.pcbi.1005929

Abstract

This work introduces a number of algebraic topology approaches, including multi-component persistent homology, multi-level persistent homology, and electrostatic persistence for the representation, characterization, and description of small molecules and biomolecular complexes. In contrast to the conventional persistent homology, multi-component persistent homology retains critical chemical and biological information during the topological simplification of biomolecular geometric complexity. Multi-level persistent homology enables a tailored topological description of inter- and/or intra-molecular interactions of interest. Electrostatic persistence incorporates partial charge information into topological invariants. These topological methods are paired with Wasserstein distance to characterize similarities between molecules and are further integrated with a variety of machine learning algorithms, including k-nearest neighbors, ensemble of trees, and deep convolutional neural networks, to manifest their descriptive and predictive powers for protein-ligand binding analysis and virtual screening of small molecules. Extensive numerical experiments involving 4,414 protein-ligand complexes from the PDBBind database and 128,374 ligand-target and decoy-target pairs in the DUD database are performed to test respectively the scoring power and the discriminatory power of the proposed topological learning strategies. It is demonstrated that the present topological learning outperforms other existing methods in protein-ligand binding affinity prediction and ligand-decoy discrimination.
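As a rough illustration of the two ingredients named in the abstract, a topological barcode and a Wasserstein distance between barcodes, the following minimal sketch computes 0-dimensional persistence bars of a small point cloud and compares two such barcodes. It is not the authors' implementation: the function names `h0_barcode` and `wasserstein`, the toy coordinates, the convention that an edge enters the filtration at the pairwise distance, and the brute-force matching (practical only for very small diagrams) are all assumptions made for this example.

```python
import itertools
import math


def h0_barcode(points):
    """Finite H0 bars of a Vietoris-Rips filtration on `points`.

    Assumption: an edge enters the filtration at the Euclidean distance
    between its endpoints. Every point is born at 0; a bar dies when its
    connected component merges into another one. The single infinite bar
    is omitted.
    """
    n = len(points)
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n)
        for j in range(i + 1, n)
    )
    bars = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components: one bar dies
            parent[ri] = rj
            bars.append((0.0, dist))
    return bars


def wasserstein(diag_a, diag_b, p=2):
    """p-Wasserstein distance between two small persistence diagrams.

    Points may be matched to each other (L-infinity ground cost) or to the
    diagonal. The optimal matching is found by brute force over all
    permutations, so this is only suitable for a handful of bars.
    """
    n, m = len(diag_a), len(diag_b)
    size = n + m  # each diagram is padded with diagonal slots

    def cost(i, j):
        if i < n and j < m:
            (b1, d1), (b2, d2) = diag_a[i], diag_b[j]
            return max(abs(b1 - b2), abs(d1 - d2)) ** p
        if i < n:  # point of A matched to the diagonal
            b, d = diag_a[i]
            return ((d - b) / 2) ** p
        if j < m:  # point of B matched to the diagonal
            b, d = diag_b[j]
            return ((d - b) / 2) ** p
        return 0.0  # diagonal matched to diagonal

    best = min(
        sum(cost(i, perm[i]) for i in range(size))
        for perm in itertools.permutations(range(size))
    )
    return best ** (1 / p)


# Toy "molecules": made-up coordinates, for illustration only.
cloud_a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
cloud_b = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
bars_a = h0_barcode(cloud_a)
bars_b = h0_barcode(cloud_b)
distance_ab = wasserstein(bars_a, bars_b)
```

A matrix of such pairwise distances could, in principle, be passed to a nearest-neighbor method, which is the kind of pairing the abstract describes; higher-dimensional bars, element-specific point clouds, and charge information would need a dedicated persistent homology library.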

Highlights

Machine learning has become one of the most important developments in data science and artificial intelligence.
In terms of methodological development, we introduce advanced persistent homology approaches for the characterization of small molecular
The ultimate goal is to determine and predict whether a given drug candidate will bind to a target so as to activate or inhibit its function, which results in a therapeutic benefit to the patient.

Summary

Introduction

Machine learning has become one of the most important developments in data science and artificial intelligence. Deep learning algorithms are able to automatically extract high-level features and discover intricate patterns in large data sets. One of the major advantages of machine learning algorithms is their ability to deal with large and diverse data sets and uncover complicated relationships. The success of deep learning has fueled the rapid growth in several areas of biological science [3, 5, 6], including bioactivity of small-molecule drugs [7, 8, 9, 10] and genetics [11, 12], where large data sets are available.

Journal: PLoS Computational Biology
Publication Date: Jan 8, 2018
Citations: 212
License type: CC BY 4.0

Similar Papers

Integration of element specific persistent homology and machine learning for protein-ligand binding affinity prediction.
Zixuan Cang ... Guo‐Wei Wei
International Journal for Numerical Methods in Biomedical Engineering | VOL. 34
16 Aug 2017

Predicting Affinity Through Homology (PATH): Interpretable Binding Affinity Prediction with Persistent Homology.
Yuxi Long ... Bruce R Donald
bioRxiv : the preprint server for biology
21 Oct 2024

A New Hybrid Neural Network Deep Learning Method for Protein-Ligand Binding Affinity Prediction and De Novo Drug Design.
Sarita Limbu ... Sivanesan Dakshanamurthy
International Journal of Molecular Sciences | VOL. 23
11 Nov 2022

DyScore: A Boosting Scoring Method with Dynamic Properties for Identifying True Binders and Nonbinders in Structure-Based Drug Discovery.
Yanjun Li ... Yaxia Yuan
Journal of Chemical Information and Modeling | VOL. 62
03 Nov 2022
