Abstract

In recent years, a growing number of associations between microRNAs (miRNAs) and human diseases have been identified. Based on the accumulating biological data, many computational models for inferring potential miRNA-disease associations have been developed. These models save time and expense in experimental studies and contribute greatly to research on the molecular mechanisms of human diseases and to the development of new drugs for disease treatment. In this paper, we propose a novel computational method named Ensemble of Decision Tree based MiRNA-Disease Association prediction (EDTMDA), a computational framework that integrates ensemble learning and dimensionality reduction. For each miRNA-disease pair, a feature vector was extracted by calculating statistical measures, graph theoretical measures, and matrix factorization results for the miRNA and the disease, respectively. Multiple base learners were then built to yield many decision trees (DTs), each based on a random selection of negative samples and miRNA/disease features. In particular, principal component analysis (PCA) was applied within each base learner to reduce feature dimensionality and thereby remove noise and redundancy. The outputs of these DTs were averaged to obtain the final association scores between miRNAs and diseases. In model performance evaluation, EDTMDA achieved an AUC of 0.9309 in global leave-one-out cross validation (LOOCV) and an AUC of 0.8524 in local LOOCV. In addition, an AUC of 0.9192 ± 0.0009 in 5-fold cross validation supported the model's reliability and stability. Furthermore, three types of case studies covering four human diseases were carried out. As a result, 94% (Esophageal Neoplasms), 86% (Kidney Neoplasms), 96% (Breast Neoplasms) and 88% (Carcinoma Hepatocellular) of the top 50 predicted miRNAs were confirmed by experimental evidence in the literature.
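To make the ensemble step described above more concrete, the following is a minimal sketch rather than the EDTMDA implementation itself. It assumes that feature vectors for miRNA-disease pairs are already available as NumPy arrays, and it uses scikit-learn's `PCA` and `DecisionTreeClassifier` as stand-ins for the dimensionality reduction and tree components. The function name `edt_scores`, its parameters (`n_learners`, `feat_frac`, `n_components`, `max_depth`), the choice of a classification tree, the equal number of sampled negatives and positives, and the synthetic data are all illustrative assumptions that do not come from the paper.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.tree import DecisionTreeClassifier


def edt_scores(X_pos, X_unlabeled, X_query, n_learners=20, feat_frac=0.5,
               n_components=5, max_depth=5, seed=0):
    """Average the scores of many PCA + decision-tree base learners.

    X_pos:       features of known associated pairs (positives)
    X_unlabeled: features of pairs with no known association
    X_query:     features of candidate pairs to score
    All names and defaults here are hypothetical, chosen for illustration.
    """
    rng = np.random.default_rng(seed)
    n_feat = X_pos.shape[1]
    # Number of features drawn per base learner (at least n_components).
    k = max(n_components, int(feat_frac * n_feat))
    y_train = np.concatenate([np.ones(len(X_pos)), np.zeros(len(X_pos))])
    total = np.zeros(len(X_query))

    for _ in range(n_learners):
        # Randomly draw as many "negatives" as positives from unlabeled pairs.
        neg_idx = rng.choice(len(X_unlabeled), size=len(X_pos), replace=False)
        # Randomly draw a subset of feature columns.
        feat_idx = rng.choice(n_feat, size=k, replace=False)
        X_train = np.vstack([X_pos, X_unlabeled[neg_idx]])[:, feat_idx]

        # Reduce dimensionality inside this base learner, then fit one tree.
        pca = PCA(n_components=n_components).fit(X_train)
        tree = DecisionTreeClassifier(
            max_depth=max_depth,
            random_state=int(rng.integers(0, 2**31 - 1)),
        )
        tree.fit(pca.transform(X_train), y_train)

        # Probability of the positive class serves as this learner's score.
        total += tree.predict_proba(pca.transform(X_query[:, feat_idx]))[:, 1]

    # Average strategy: the final score is the mean over all base learners.
    return total / n_learners


# Synthetic demonstration data (not real miRNA-disease features).
demo_rng = np.random.default_rng(1)
demo_pos = demo_rng.normal(2.0, 1.0, size=(40, 12))
demo_unlabeled = demo_rng.normal(0.0, 1.0, size=(200, 12))
demo_query = np.vstack([
    demo_rng.normal(2.0, 1.0, size=(5, 12)),   # resemble positives
    demo_rng.normal(0.0, 1.0, size=(5, 12)),   # resemble unlabeled pairs
])
demo_scores = edt_scores(demo_pos, demo_unlabeled, demo_query)
print(np.round(demo_scores, 3))
```

In this sketch, sorting the candidate pairs by the averaged score gives a ranking of the kind used in the case studies, where the top 50 predicted miRNAs for each disease were checked against the literature.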

Highlights

  • MicroRNAs are a class of endogenous non-coding RNAs about 22 nucleotides in length that regulate gene expression by base pairing with target messenger RNA [1]

  • We carried out three types of case studies on important diseases to evaluate model performance: one based on known associations in HMDD V2.0, one for new diseases without any known associations, and one based on known associations in HMDD V1.0

  • We believe that EDTMDA can make reliable predictions and guide experiments to uncover more miRNA-disease associations

Introduction

MicroRNAs (miRNAs) are a class of endogenous non-coding RNAs about 22 nucleotides in length that regulate gene expression by base pairing with target messenger RNA (mRNA) [1]. An existing study has validated that the expression of mir-140 is reduced in osteoarthritic cartilage [8]. Another example is that down-regulation of mir-145 is related to increased expression of ERG, whose over-expression is a distinct characteristic of prostate cancer [9]. A growing number of studies have been devoted to developing computational models to predict potential miRNA-disease associations [13]. These computational models can infer the miRNAs that are most likely to be related to a given disease. Based on the prediction results, biological experiments can be conducted preferentially for those miRNAs, which improves experimental efficiency and saves time and expense.
