A growing number of studies have shown that microRNAs (miRNAs) play a key role in many important biological processes. Dysregulation of miRNAs can lead to a variety of diseases, including cancers, so predicting potential miRNA-disease associations is important for understanding disease pathogenesis and for supporting diagnosis, treatment and drug development. Experimental validation of miRNA-disease associations typically involves miRNA knockout or knockdown, which is time-consuming and labor-intensive. As a result, computational models have been developed to predict unknown miRNA-disease associations from available information on miRNAs, diseases, genes, and so on. However, their performance still leaves room for improvement. Noting that appropriately combining multiple data sources usually improves prediction accuracy, we have developed IMDAILM (Inferring miRNA-Disease Association by Integrating lncRNA and miRNA data), a low-rank matrix completion model that integrates miRNA, long noncoding RNA (lncRNA) and disease information to predict miRNA-disease associations. Specifically, the miRNA-disease association network and the lncRNA-disease association network are fused into a new heterogeneous network with three types of nodes, representing miRNAs, lncRNAs and diseases. In addition, a negative sample inference method is proposed to infer unrelated miRNA-disease pairs. Based on both the heterogeneous network and the negative samples, a low-rank matrix completion model is formulated and solved. Under 5-fold cross-validation (CV), IMDAILM achieved an area under the curve (AUC) of 0.8884 for predicting miRNAs associated with diseases, outperforming several recent methods. IMDAILM also yielded an AUC of 0.8870 for predicting both lncRNAs and miRNAs associated with diseases. The 5-fold CV results further indicate that IMDAILM is superior to other methods in predicting miRNAs associated with isolated diseases. Finally, we confirmed a few novel predicted miRNAs associated with specific diseases, such as lung cancer, by literature mining. In summary, integrating lncRNA information into a matrix completion framework contributes to the prediction of miRNA-disease associations.
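
To make the general idea concrete, the following is a minimal sketch of matrix completion on a fused association matrix. It is not the IMDAILM formulation or solver, which the abstract does not specify. It assumes that the two networks are stacked row-wise into one (miRNA + lncRNA) by disease matrix, that known associations are coded as 1, inferred negatives as 0 and unknown pairs as `None`, and that a rank-constrained factorization fitted by stochastic gradient descent stands in for the low-rank model. The names `stack_networks` and `complete_low_rank`, the toy data and all hyperparameters are hypothetical.

```python
import random


def stack_networks(mirna_disease, lncrna_disease):
    """Stack two association matrices row-wise (rows = RNAs, columns = diseases)."""
    return [row[:] for row in mirna_disease] + [row[:] for row in lncrna_disease]


def complete_low_rank(matrix, rank=2, epochs=2000, lr=0.05, reg=0.01, seed=0):
    """Fit matrix ~ U V^T on observed entries only (None = unknown) using SGD.

    This factorization is an illustrative surrogate for a low-rank
    completion model, not the solver used by IMDAILM.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(matrix), len(matrix[0])
    U = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_rows)]
    V = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_cols)]
    observed = [
        (i, j, matrix[i][j])
        for i in range(n_rows)
        for j in range(n_cols)
        if matrix[i][j] is not None
    ]
    for _ in range(epochs):
        rng.shuffle(observed)
        for i, j, value in observed:
            pred = sum(U[i][k] * V[j][k] for k in range(rank))
            err = value - pred
            for k in range(rank):
                u, v = U[i][k], V[j][k]
                # Gradient step on squared error with L2 regularization.
                U[i][k] += lr * (err * v - reg * u)
                V[j][k] += lr * (err * u - reg * v)
    # Return the dense score matrix, including scores for unknown pairs.
    return [
        [sum(U[i][k] * V[j][k] for k in range(rank)) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Toy data (hypothetical): 3 miRNAs, 2 lncRNAs, 3 diseases.
mirna_disease = [[1, None, 0], [None, 1, 0], [1, 1, None]]
lncrna_disease = [[1, None, 0], [0, 1, None]]

fused = stack_networks(mirna_disease, lncrna_disease)
scores = complete_low_rank(fused)

# Rank the unknown miRNA-disease pairs by predicted score (highest first).
n_mirna = len(mirna_disease)
candidates = sorted(
    ((scores[i][j], i, j)
     for i in range(n_mirna)
     for j in range(len(fused[0]))
     if fused[i][j] is None),
    reverse=True,
)
```

In this sketch the lncRNA rows influence miRNA predictions only through the shared disease factors `V`, which is one simple way that lncRNA information could inform miRNA-disease scores.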