ALSBMF: Predicting lncRNA-Disease Associations by Alternating Least Squares Based on Matrix Factorization

Wen Zhu, Kaimei Huang, Fang-Xiang Wu, Bo Liao, Yuhua Yao, Xiaofang Xiao

doi:10.1109/access.2020.2970069

Abstract

In recent years, it has become increasingly clear that long non-coding RNAs (lncRNAs) can regulate their target genes at multiple levels, including the transcriptional and translational levels, and play key regulatory roles in many important biological processes, such as cell differentiation and chromatin remodeling. Inferring potential lncRNA-disease associations is essential to reveal the mechanisms behind diseases, develop novel drugs, and optimize personalized treatments. However, biological experiments to validate lncRNA-disease associations are time-consuming and costly, so effective computational models are needed. In this study, we propose a method, referred to as ALSBMF, that predicts lncRNA-disease associations by alternating least squares based on matrix factorization. ALSBMF first decomposes the known lncRNA-disease association matrix into two feature matrices. It then defines an optimization function using disease semantic similarity, lncRNA functional similarity and known lncRNA-disease associations, and solves for the two optimal feature matrices by the least squares method. Finally, the two optimal feature matrices are multiplied to reconstruct the scoring matrix, filling the missing values of the original matrix to predict lncRNA-disease associations. Like BPLLDA, ALSBMF does not require negative samples and can predict associations for novel lncRNAs or novel diseases. The prediction performance of ALSBMF is evaluated by leave-one-out cross-validation (LOOCV) and five-fold cross-validation. The AUCs are 0.9501 and 0.9215, respectively, which are better than those of existing methods. Furthermore, colon cancer, kidney cancer, and liver cancer are selected as case studies. The top three predicted lncRNAs for each of colon cancer, kidney cancer, and liver cancer were validated in the latest LncRNADisease database and related literature.
To test the ability of ALSBMF to predict novel disease-associated lncRNAs and novel lncRNA-associated diseases, all known associations of the diseases and lncRNAs in question were removed. The top five predicted lncRNAs related to breast cancer and to nasopharyngeal carcinoma, and the top five predicted cancers related to the lncRNAs H19 and MALAT1, were then validated in PubMed and dbSNP.
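The decompose, alternate, and reconstruct steps described in the abstract can be illustrated with a minimal sketch of plain regularized alternating least squares. This is not the authors' objective: it omits the disease semantic similarity and lncRNA functional similarity terms, and it treats every unobserved entry as an observed zero. The function names, the toy matrix, and the parameters `rank`, `lam` and `n_iters` are assumptions made for illustration only.

```python
import numpy as np


def als_factorize(A, rank=2, lam=0.1, n_iters=50, seed=0):
    """Approximate A (lncRNAs x diseases) as U @ V.T by alternating ridge solves."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = A.shape
    U = rng.random((n_rows, rank))  # hypothetical lncRNA feature matrix
    V = rng.random((n_cols, rank))  # hypothetical disease feature matrix
    reg = lam * np.eye(rank)
    for _ in range(n_iters):
        # Fix V and solve (V^T V + lam I) U^T = V^T A^T for U.
        U = np.linalg.solve(V.T @ V + reg, V.T @ A.T).T
        # Fix U and solve (U^T U + lam I) V^T = U^T A for V.
        V = np.linalg.solve(U.T @ U + reg, U.T @ A).T
    return U, V


def objective(A, U, V, lam):
    """Squared reconstruction error plus L2 penalties on both factors."""
    return np.sum((A - U @ V.T) ** 2) + lam * (np.sum(U ** 2) + np.sum(V ** 2))


# Toy association matrix (made-up data): rows are lncRNAs, columns are diseases.
A = np.array([
    [1, 0, 1],
    [1, 0, 0],
    [0, 1, 0],
    [0, 1, 1],
], dtype=float)

U, V = als_factorize(A, rank=2, lam=0.1)
scores = U @ V.T  # reconstructed scoring matrix

# Rank the unknown (zero) entries by predicted score, highest first.
unknown = np.argwhere(A == 0)
ranked = sorted(unknown.tolist(), key=lambda ij: -scores[ij[0], ij[1]])
print(ranked[:3])
```

Each half-step is an exact minimization over one factor with the other held fixed, so the regularized objective cannot increase from one iteration to the next. In a setting like the one the abstract describes, the highest-scoring unknown entries would be read as candidate associations.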

Highlights

Sequence analysis of the human genome identified only 20,000 coding sequences that can be translated into proteins, and these coding sequences account for less than 2% of the human genome (Paul et al, 2004)
The two optimal feature matrices are multiplied to reconstruct the scoring matrix, filling the missing values of the original matrix to predict long non-coding RNA (lncRNA)-disease associations
Biological experiments have been the primary method for identifying lncRNA-disease associations

The associate editor coordinating the review of this manuscript and approving it for publication was Kin Fong Lei.

Summary

INTRODUCTION

Sequence analysis of the human genome identified only 20,000 coding sequences that can be translated into proteins [...] The disadvantage of this method is that it requires information on negative samples, which is unknown in this field of study. To solve this problem, Chen et al identified candidate lncRNA-disease associations by establishing a Laplacian regularized least squares method based on a semi-supervised learning framework (Chen and Yan, 2013). Sun et al (Sun et al, 2014) proposed a computational method called RWRlncD based on lncRNA-disease associations, lncRNA similarity and disease similarity. This algorithm performs a random walk with restart (RWR) on the lncRNA functional similarity network to capture latent lncRNA-disease associations. [...] GrwLDA has some drawbacks, such as how to choose the optimal parameters. [...] In both of the above parts, all computational models require known lncRNA-disease associations for the prediction. [...] This method shows excellent prediction performance in experimental results.
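The random walk with restart mentioned above can be sketched with the standard RWR iteration on a small similarity network. This is a generic illustration, not the RWRlncD implementation: the similarity matrix, the seed choice, and the parameter names `restart`, `tol` and `max_iter` are assumptions.

```python
import numpy as np


def random_walk_with_restart(W, seeds, restart=0.5, tol=1e-8, max_iter=1000):
    """Iterate p <- (1 - restart) * P @ p + restart * p0 until p stops changing."""
    W = np.asarray(W, dtype=float)
    col_sums = W.sum(axis=0)
    col_sums[col_sums == 0] = 1.0  # guard against isolated nodes
    P = W / col_sums               # column-normalized transition matrix
    p0 = np.zeros(W.shape[0])
    p0[list(seeds)] = 1.0 / len(seeds)  # walker starts uniformly on the seeds
    p = p0.copy()
    for _ in range(max_iter):
        p_next = (1 - restart) * P @ p + restart * p0
        if np.abs(p_next - p).sum() < tol:
            return p_next
        p = p_next
    return p


# Made-up functional similarity network over four lncRNAs (symmetric, zero diagonal).
W = np.array([
    [0.0, 0.8, 0.1, 0.0],
    [0.8, 0.0, 0.5, 0.2],
    [0.1, 0.5, 0.0, 0.6],
    [0.0, 0.2, 0.6, 0.0],
])

# Assume lncRNA 0 is the only one known to be associated with the disease of interest.
p = random_walk_with_restart(W, seeds=[0], restart=0.5)
print(np.argsort(-p))  # lncRNAs ordered by steady-state probability
```

The steady-state probabilities rank the non-seed lncRNAs by their proximity to the seeds in the network. A larger `restart` value keeps the walker closer to the seeds.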

MATERIALS AND METHODS

ALSBMF

RESULT

Findings

CONCLUSION

Journal: IEEE Access
Publication Date: Jan 1, 2020
Citations: 6
License type: CC BY 4.0

Similar Papers

Prediction of lncRNA-Disease Associations via Closest Node Weight Graphs of the Spatial Neighborhood Based on the Edge Attention Graph Convolutional Network.
Jianwei Li ... Mengfan Kong
Frontiers in genetics | VOL. 12 | 04 Jan 2022

A random forest based computational model for predicting novel lncRNA-disease associations
Dengju Yao ... Xiaorong Zhan
BMC Bioinformatics | VOL. 21 | 27 Mar 2020

IDSSIM: an lncRNA functional similarity calculation model based on an improved disease semantic similarity method
Wenwen Fan ... Yan Sun
BMC Bioinformatics | VOL. 21 | 31 Jul 2020

KATZLDA: KATZ measure for the lncRNA-disease association prediction.
Xing Chen
Scientific Reports | VOL. 5 | 18 Nov 2015
