Abstract

A key aim of post-genomic biomedical research is to understand and model complex biomolecular activities from a systematic perspective. Biomolecular interactions are widespread and interrelated: multiple biomolecules coordinate to sustain life activities, and any disturbance of these complex connections can lead to abnormal life activities or complex diseases. However, many existing studies focus only on individual types of intermolecular interaction. In this work, we constructed and analyzed a large-scale molecular association network of multiple biomolecules in humans by integrating associations among lncRNAs, miRNAs, proteins, drugs, and diseases, in which the various associations are interconnected and any type of association can be predicted. We propose MAN-HOPE (Molecular Association Network with High-Order Proximity preserved Embedding), a novel method based on network representation learning that exploits the latent features of biomolecules to accurately predict associations between molecules. More specifically, the network representation learning algorithm HOPE was applied to learn the behavior features of nodes in the association network, and the attribute features of nodes were also adopted. A CatBoost machine learning model was then trained to predict potential associations between any pair of nodes. The performance of our method was evaluated under five-fold cross validation, and a case study predicting miRNA-disease associations was conducted to verify its prediction capability. MAN-HOPE achieves an accuracy of 93.3% and an area under the receiver operating characteristic curve of 0.9793. The experimental results support this systematic view of intermolecular associations and enable systematic exploration of the landscape of molecular interactions that shape specialized cellular functions.
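
To make the embedding step concrete, the following is a minimal sketch of a HOPE-style embedding on a toy graph. It assumes Katz proximity as the high-order proximity measure and uses a dense SVD from NumPy in place of the generalized SVD of the original HOPE algorithm; the abstract does not state which proximity measure or solver the authors used. The function name `hope_embedding`, the decay parameter `beta`, and the toy adjacency matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def hope_embedding(adj, dim, beta):
    """Return (source, target) embeddings whose inner products
    approximate the Katz proximity matrix of the graph.

    Assumption: Katz proximity S = (I - beta*A)^{-1} (beta*A), which
    converges only when beta < 1 / spectral_radius(A).
    """
    n = adj.shape[0]
    # Solve (I - beta*A) S = beta*A for S instead of inverting explicitly.
    katz = np.linalg.solve(np.eye(n) - beta * adj, beta * adj)
    # Dense SVD as a simplification; keep the top `dim` singular triplets.
    u, sigma, vt = np.linalg.svd(katz)
    sqrt_sigma = np.sqrt(sigma[:dim])
    source = u[:, :dim] * sqrt_sigma      # embedding of each node as a source
    target = vt[:dim, :].T * sqrt_sigma   # embedding of each node as a target
    return source, target


# Toy undirected graph with four nodes (hypothetical data).
adj = np.array(
    [
        [0, 1, 1, 0],
        [1, 0, 1, 0],
        [1, 1, 0, 1],
        [0, 0, 1, 0],
    ],
    dtype=float,
)
source, target = hope_embedding(adj, dim=2, beta=0.1)
# One behavior-feature vector per node: concatenate both views.
node_features = np.hstack([source, target])
```

In a pipeline like the one described, such behavior features might be concatenated with node attribute features for each candidate pair and passed to a classifier such as CatBoost; that wiring is not shown here and its details are an assumption.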

Highlights

  • One key issue in systems biology and genomics research is how different biomolecules interact with one another to bring about the appropriate cellular activities (Barabási and Oltvai, 2004)

  • The entire data set is randomly divided into five equal parts; in each of five rounds, four parts serve as the training set and the remaining part as the test set, and the average over the five rounds is taken as the final performance

  • On the entire Molecular Association Network (MAN), when predicting any type of molecular association, that is, any link or edge in the association network, our method MAN-HOPE (High-Order Proximity preserved Embedding) achieves an average accuracy of 93.30%, a sensitivity of 91.50%, a specificity of 95.10%, a precision of 94.91%, a Matthews correlation coefficient (MCC) of 86.66%, an area under the ROC curve (AUC) of 97.93%, and an area under the precision-recall curve (AUPR) of 97.61%
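
The evaluation protocol and the threshold-based metrics listed above can be sketched as follows. This is an illustrative outline using only the Python standard library, not the authors' evaluation code; the helper names `five_fold_indices` and `binary_metrics` and the toy labels are hypothetical. AUC and AUPR are omitted because they require ranked prediction scores rather than hard labels.

```python
import math
import random


def five_fold_indices(n_samples, seed=0):
    """Yield (train, test) index lists for five-fold cross validation."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Five parts whose sizes differ by at most one sample.
    folds = [indices[k::5] for k in range(5)]
    for k in range(5):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test


def _ratio(numerator, denominator):
    """Safe division that returns 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(y_true, y_pred):
    """Compute accuracy, sensitivity, specificity, precision, and MCC."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "precision": _ratio(tp, tp + fp),
        "mcc": _ratio(tp * tn - fp * fn, mcc_denominator),
    }


# Toy labels (hypothetical): 1 = known association, 0 = sampled non-association.
y_true = [1, 1, 1, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)
splits = list(five_fold_indices(10))
```

In a full run, a model would be trained on each `train` split and scored on the matching `test` split, and the five per-fold metric values would be averaged to give the reported figures.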

Introduction

One key issue in systems biology and genomics research is how different biomolecules interact with one another to bring about the appropriate cellular activities (Barabási and Oltvai, 2004). When predicting associations or interactions between two molecules, existing approaches generally use only the property information of the two kinds of molecules themselves, such as the sequence or structure information of an RNA or protein, the chemical structure information of a drug compound, and the semantic characterization information of a disease. As a result, the association patterns between nodes in the network are lost.
