Abstract

Background: A large body of evidence shows that miRNAs regulate the expression of their target genes at the post-transcriptional level and that the dysregulation of miRNAs is related to many complex human diseases. Accurately discovering disease-related miRNAs is conducive to exploring the pathogenesis and treatment of diseases. However, because experimental methods are time-consuming and expensive, predicting miRNA-disease associations with computational models has become a more economical and effective means.

Results: Inspired by previous work, we proposed an improved computational model based on random forest (RF) for identifying miRNA-disease associations (IRFMDA). First, the integrated similarity of diseases was calculated by combining their semantic similarity and Gaussian interaction profile kernel (GIPK) similarity, and the integrated similarity of miRNAs was calculated by combining their functional similarity and GIPK similarity. Then, the two integrated similarities were combined to represent each miRNA-disease relationship pair. Next, the miRNA-disease relationship pairs contained in the HMDD (v2.0) database were taken as positive samples, and randomly constructed miRNA-disease relationship pairs not included in HMDD (v2.0) were taken as negative samples. Feature selection based on the variable importance score of RF was then performed to choose the more useful features for representing samples, improving the model’s ability to infer miRNA-disease associations. Finally, an RF regression model was trained on the samples in the reduced feature space to score the unknown miRNA-disease associations. The AUCs of IRFMDA under local leave-one-out cross-validation (LOOCV), global LOOCV and 5-fold cross-validation were 0.8728, 0.9398 and 0.9363, respectively, which were better than those of several strong existing models for predicting miRNA-disease associations. Moreover, case studies on oesophageal cancer, lymphoma and lung cancer showed that 94 (oesophageal cancer), 98 (lymphoma) and 100 (lung cancer) of the top 100 disease-associated miRNAs predicted by IRFMDA were supported by experimental data in the dbDEMC (v2.0) database.

Conclusions: Cross-validation and case studies demonstrated that IRFMDA is an excellent miRNA-disease association prediction model that can guide future experimental studies on the regulatory mechanisms of miRNAs in complex human diseases.
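The similarity and feature-construction steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The GIPK function follows the commonly used definition, exp(-gamma * squared distance between interaction profiles), with gamma normalised by the mean squared profile norm. The integration rule (use the semantic or functional similarity where it is non-zero, otherwise fall back to GIPK) and the concatenation of similarity rows into a pair feature vector are assumptions, since the abstract says only that the similarities were "combined". All function names and the toy data are hypothetical.

```python
import math

def gipk_similarity(profiles, gamma_prime=1.0):
    """GIPK similarity between the rows of a binary association matrix."""
    n = len(profiles)
    # Bandwidth is normalised by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in profiles) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = gamma_prime / mean_sq_norm
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(profiles[i], profiles[j]))
            sim[i][j] = math.exp(-gamma * sq_dist)
    return sim

def integrate(known_sim, gipk_sim):
    """Assumed rule: keep the known similarity if non-zero, else use GIPK."""
    return [[k if k > 0 else g for k, g in zip(k_row, g_row)]
            for k_row, g_row in zip(known_sim, gipk_sim)]

def pair_features(m, d, mirna_sim, disease_sim):
    """Assumed representation: miRNA similarity row followed by disease row."""
    return mirna_sim[m] + disease_sim[d]

# Toy association matrix: 3 miRNAs (rows) x 2 diseases (columns).
assoc = [[1, 0],
         [1, 1],
         [0, 1]]
assoc_t = [list(col) for col in zip(*assoc)]  # disease interaction profiles

# Hypothetical functional (miRNA) and semantic (disease) similarities.
functional = [[1.0, 0.6, 0.0],
              [0.6, 1.0, 0.0],
              [0.0, 0.0, 1.0]]
semantic = [[1.0, 0.0],
            [0.0, 1.0]]

mirna_gipk = gipk_similarity(assoc)
disease_gipk = gipk_similarity(assoc_t)
mirna_sim = integrate(functional, mirna_gipk)
disease_sim = integrate(semantic, disease_gipk)

# Feature vector for the pair (miRNA 0, disease 1); its length is 3 + 2.
features = pair_features(0, 1, mirna_sim, disease_sim)
```

Feature vectors built this way for the positive and negative samples would then be passed to a random forest for importance-based feature selection and regression, a step not shown here.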

Highlights

  • A large body of evidence shows that miRNAs regulate the expression of their target genes at the post-transcriptional level and that the dysregulation of miRNAs is related to many complex human diseases

  • Of the top 30, 50, 80 and 100 miRNAs predicted by the improved RF-based prediction model for miRNA-disease associations (IRFMDA), 30, 50, 79 and 98, respectively, were validated by records in dbDEMC (v2.0). These results indicate that IRFMDA has a good ability to predict miRNA-disease associations

  • [47], we developed an IRFMDA model based on random forest (RF) to predict potential miRNA-disease associations

Summary

Introduction

A large body of evidence shows that miRNAs regulate the expression of their target genes at the post-transcriptional level and that the dysregulation of miRNAs is related to many complex human diseases. Discovering disease-related miRNAs is conducive to exploring the pathogenesis and treatment of diseases. Increasing evidence has demonstrated that the abnormal regulation of miRNAs causes the occurrence and progression of many complex human diseases, including various cancers [9,10,11,12], cardiovascular diseases [13,14,15] and metabolic diseases [16,17,18], to name a few. Tens of thousands of associations between diseases and miRNAs have been discovered and validated by various biological experiments. Discovering and validating more miRNA-disease associations is therefore very important for exploring the pathogenesis of these diseases and their treatment options.
