Abstract

Numerous experiments have demonstrated that long non-coding RNAs (lncRNAs) play an important role in various systems of the human body, and that lncRNA deletions or mutations can cause human disease. Predicting lncRNA-disease associations therefore aids the diagnosis and prevention of complex diseases. Identifying these associations through biological experiments is time-consuming and expensive, whereas computational methods can discover them with far fewer human and material resources. In this paper, we propose a neural network-based matrix factorization model, called NeuMFLDA, to predict lncRNA-disease associations. NeuMFLDA first converts the one-hot encoding of each disease or lncRNA into a word vector via the embedding layer. It then combines the memorization of conventional matrix factorization with the generalization of a multi-layer perceptron to predict lncRNA-disease associations more accurately. In addition, instead of a conventional pointwise loss function, we propose a new pairwise loss function to update the model parameters. This loss optimizes the model from the perspective of ranking priority, which better matches the lncRNA-disease association prediction task. Experiments show that NeuMFLDA reaches average AUCs of 0.904 ± 0.003 and 0.918 ± 0.002 under 5-fold cross-validation and leave-one-out cross-validation (LOOCV), respectively, outperforming three state-of-the-art methods. In case studies, 9, 9 and 8 of the top-10 candidate lncRNAs are verified by recently published literature for hepatocellular carcinoma, kidney cancer and ovarian cancer, respectively. In short, NeuMFLDA is an effective tool for predicting lncRNA-disease associations.
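The architecture described in the abstract can be illustrated with a minimal sketch. The abstract does not give the layer sizes, the fusion rule, or the exact form of the pairwise loss, so everything below is an assumption: the toy dimensions, the single ReLU hidden layer, the names `score` and `pairwise_loss`, and the BPR-style loss (negative log-sigmoid of the score difference) are stand-ins, not the authors' definitions.

```python
import math
import random

random.seed(0)

# Toy sizes (hypothetical; the paper's real dimensions are not given here).
N_LNCRNA, N_DISEASE, DIM = 5, 4, 3


def rand_matrix(rows, cols):
    """Small random weights, standing in for trained parameters."""
    return [[random.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


# Embedding tables: multiplying a one-hot vector by the table selects one row,
# so an index lookup is equivalent to the one-hot -> embedding step.
gmf_lnc, gmf_dis = rand_matrix(N_LNCRNA, DIM), rand_matrix(N_DISEASE, DIM)
mlp_lnc, mlp_dis = rand_matrix(N_LNCRNA, DIM), rand_matrix(N_DISEASE, DIM)
W_hidden = rand_matrix(DIM, 2 * DIM)  # assumed single hidden layer
w_out = [random.gauss(0.0, 0.1) for _ in range(2 * DIM)]  # fusion weights


def score(l, d):
    """Association score for lncRNA index l and disease index d."""
    # Matrix factorization branch: element-wise product of embeddings.
    gmf = [a * b for a, b in zip(gmf_lnc[l], gmf_dis[d])]
    # MLP branch: concatenate embeddings, then one ReLU layer.
    x = mlp_lnc[l] + mlp_dis[d]
    hidden = [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W_hidden]
    # Fuse both branches with a linear output layer.
    fused = gmf + hidden
    return sum(w * v for w, v in zip(w_out, fused))


def pairwise_loss(l_pos, l_neg, d):
    """BPR-style ranking loss (an assumed stand-in for the paper's loss)."""
    diff = score(l_pos, d) - score(l_neg, d)
    return math.log1p(math.exp(-diff))  # equals -log(sigmoid(diff))


# Loss for ranking a known associated lncRNA (0) above an unobserved one (3).
loss = pairwise_loss(l_pos=0, l_neg=3, d=2)
```

A pointwise loss would push each score toward 0 or 1 independently; the pairwise form only penalizes the model when an unobserved lncRNA is scored close to or above a known one for the same disease, which is what a ranking-based evaluation measures.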

Highlights

  • Many non-coding RNAs (ncRNAs) are transcribed from the human genome but are not translated into protein, and they were long regarded as transcriptional noise

  • Leave-one-out cross-validation (LOOCV) is performed on the known long non-coding RNA (lncRNA)-disease associations

  • In the 5-fold cross-validation framework, the known lncRNA-disease associations were randomly divided into five groups such that every disease is represented in each group
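The two validation schemes in the highlights can be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the function names `stratified_folds` and `loocv_splits`, the toy association list, and the round-robin dealing strategy are all hypothetical. Note also that a disease with fewer than five known associations cannot appear in every fold, and the text does not say how such diseases were handled.

```python
import random
from collections import defaultdict


def stratified_folds(pairs, k=5, seed=0):
    """Split (lncRNA, disease) pairs into k folds, spreading each disease
    across folds by shuffling its pairs and dealing them round-robin."""
    rng = random.Random(seed)
    by_disease = defaultdict(list)
    for lnc, dis in pairs:
        by_disease[dis].append((lnc, dis))
    folds = [[] for _ in range(k)]
    counter = 0  # shared across diseases so fold sizes stay balanced
    for dis in sorted(by_disease):
        items = by_disease[dis]
        rng.shuffle(items)
        for item in items:
            folds[counter % k].append(item)
            counter += 1
    return folds


def loocv_splits(pairs):
    """Yield (held_out_pair, remaining_pairs) for each known association."""
    for i, held_out in enumerate(pairs):
        yield held_out, pairs[:i] + pairs[i + 1:]


# Toy data: 5 associations for disease d1 and 10 for disease d2.
pairs = [(f"lnc{i}", "d1") for i in range(5)]
pairs += [(f"lnc{i}", "d2") for i in range(10)]
folds = stratified_folds(pairs)
```

In each round of 5-fold cross-validation, one fold serves as the test set and the other four as training data; in LOOCV, each known association is held out once while all others are used for training.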

Summary

INTRODUCTION

Many non-coding RNAs (ncRNAs) are transcribed from the human genome but are not translated into protein, and they were long regarded as transcriptional noise. Chen proposed a semi-supervised learning method (LRLSLDA), based on the Laplacian regularized least squares framework, to identify potential disease-associated lncRNAs[21]. This was the first lncRNA-disease prediction model, and it enabled researchers to better understand the relationship between lncRNAs and disease. Lu proposed a method (SIMCLDA) for predicting potential lncRNA-disease associations based on inductive matrix completion[24]. It computes the Gaussian interaction profile kernel of lncRNAs from known lncRNA-disease interactions, and the functional similarity of diseases from disease-gene and gene-gene ontology associations. We propose a neural network-based matrix factorization method (NeuMFLDA) to predict lncRNA-disease associations. NeuMFLDA is an effective model for identifying disease-associated lncRNAs.
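For readers unfamiliar with the Gaussian interaction profile kernel mentioned for SIMCLDA, here is a minimal sketch of its commonly used form. The text above does not state the bandwidth choice, so the normalization (dividing a parameter `gamma_prime`, set to 1, by the mean squared norm of the interaction profiles) is an assumption, as are the function name `gip_kernel` and the toy matrix `A`. The sketch also assumes at least one known association, since an all-zero matrix would make the bandwidth undefined.

```python
import math


def gip_kernel(A, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of A.

    A[i] is the binary interaction profile of lncRNA i: one entry per
    disease, 1 if the association is known and 0 otherwise.
    """
    n = len(A)
    # Bandwidth normalized by the mean squared norm of the profiles.
    mean_sq_norm = sum(sum(v * v for v in row) for row in A) / n
    gamma = gamma_prime / mean_sq_norm
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            sq_dist = sum((a - b) ** 2 for a, b in zip(A[i], A[j]))
            K[i][j] = math.exp(-gamma * sq_dist)
    return K


# Toy association matrix: 3 lncRNAs (rows) by 3 diseases (columns).
A = [
    [1, 0, 1],
    [1, 0, 1],
    [0, 1, 0],
]
K = gip_kernel(A)
```

Two lncRNAs with identical interaction profiles (rows 0 and 1 here) get similarity 1, and the similarity decays as their known disease associations diverge.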

MATERIAL AND METHODS
DATABASE
NeuMFLDA
RESULTS
CONCLUSION