Abstract
Accumulating experimental studies have indicated that lncRNAs influence various critical biological processes as well as disease development and progression. Calculating lncRNA functional similarity is of high value for inferring lncRNA functions and identifying potential lncRNA-disease associations. However, little effort has been made to measure the functional similarity among lncRNAs on a large scale. In this study, we developed a Fuzzy Measure-based LNCRNA functional SIMilarity calculation model (FMLNCSIM) based on the assumption that functionally similar lncRNAs tend to be associated with similar diseases. The performance improvement of FMLNCSIM mainly comes from the combination of information content and the concept of fuzzy measure, which was applied to the directed acyclic graphs of disease MeSH descriptors. To evaluate the effectiveness of FMLNCSIM, we further combined it with the previously proposed model of Laplacian Regularized Least Squares for lncRNA-Disease Association (LRLSLDA). The integrated model, LRLSLDA-FMLNCSIM, achieved good performance in global LOOCV (AUCs of 0.8266 and 0.9338 on the LncRNADisease and MNDR databases, respectively) and in 5-fold cross validation (average AUCs of 0.7979 and 0.9237 on the LncRNADisease and MNDR databases, respectively), a significant improvement over previous classical models. It is anticipated that FMLNCSIM could be used to search for functionally similar lncRNAs and to infer lncRNA functions in future research.
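The assumption stated above, that functionally similar lncRNAs tend to be associated with similar diseases, can be illustrated with a minimal Python sketch. This is not the FMLNCSIM formula: the abstract does not give the fuzzy-measure definition, so the sketch substitutes a simple information-content-weighted overlap of disease DAG terms and a best-match average over disease sets. The toy DAG, the associations, and all function names are hypothetical and stand in for MeSH descriptors and database records.

```python
import math

# Hypothetical toy disease DAG (child -> parents); not real MeSH data.
PARENTS = {
    "neoplasms": [],
    "carcinoma": ["neoplasms"],
    "breast cancer": ["carcinoma"],
    "lung cancer": ["carcinoma"],
    "leukemia": ["neoplasms"],
}

# Hypothetical known lncRNA-disease associations.
ASSOC = {
    "lncA": {"breast cancer", "lung cancer"},
    "lncB": {"breast cancer"},
    "lncC": {"leukemia"},
}


def dag_terms(disease):
    """Return the disease together with all of its ancestors in the DAG."""
    seen, stack = set(), [disease]
    while stack:
        term = stack.pop()
        if term not in seen:
            seen.add(term)
            stack.extend(PARENTS[term])
    return seen


def information_content(term):
    """IC = -log(fraction of diseases whose DAG contains the term)."""
    count = sum(1 for d in PARENTS if term in dag_terms(d))
    return -math.log(count / len(PARENTS))


def disease_similarity(d1, d2):
    """IC-weighted overlap of the two disease DAGs (an assumed stand-in measure)."""
    t1, t2 = dag_terms(d1), dag_terms(d2)
    union_ic = sum(information_content(t) for t in t1 | t2)
    if union_ic == 0:  # only zero-IC (root-level) terms are involved
        return 1.0 if d1 == d2 else 0.0
    return sum(information_content(t) for t in t1 & t2) / union_ic


def lncrna_similarity(l1, l2):
    """Best-match average between the disease sets of two lncRNAs."""
    s1, s2 = ASSOC[l1], ASSOC[l2]
    best1 = sum(max(disease_similarity(a, b) for b in s2) for a in s1)
    best2 = sum(max(disease_similarity(b, a) for a in s1) for b in s2)
    return (best1 + best2) / (len(s1) + len(s2))


if __name__ == "__main__":
    for pair in [("lncA", "lncB"), ("lncA", "lncC")]:
        print(pair, round(lncrna_similarity(*pair), 4))
```

In this toy setting, lncA and lncB share a disease and therefore score higher than lncA and lncC, whose diseases only share the zero-information root term.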
Highlights
In recent years, observations from Next Generation Sequencing (NGS) projects have indicated that non-coding sequences account for a large portion of the complete human genome
FMLNCSIM is a computational model for calculating the functional similarity of long noncoding RNAs (lncRNAs) using known lncRNA-disease associations and disease directed acyclic graphs (DAGs) (see Figures 1 and 2)
On the LncRNADisease database, LRLSLDA-FMLNCSIM (Laplacian Regularized Least Squares for lncRNA-Disease Association combined with FMLNCSIM) achieved the best performance, with an area under the ROC curve (AUC) of 0.7979+/-0.0098, significantly higher than those of the other methods (LRLSLDA: 0.7295+/-0.0089; LRLSLDA-LNCSIM1: 0.7761+/-0.01; LRLSLDA-LNCSIM2: 0.7872+/-0.0097)
Summary
Observations from Next Generation Sequencing (NGS) projects indicate that non-coding sequences account for a large portion (more than 98%) of the complete human genome. A great number of non-coding RNAs (ncRNAs), which do not encode proteins, have been discovered, especially long noncoding RNAs (lncRNAs). Increasing evidence from biological experiments has shown that lncRNAs carry out various crucial functions, which clearly contradicts the traditional viewpoint. LncRNAs cover a wide range of functions, modulating gene expression at the epigenetic, transcriptional, and post-transcriptional levels [2]. LncRNAs are involved in diverse biological processes, such as chromatin modification, cell differentiation and proliferation, RNA processing, and cellular apoptosis [7,8,9,10,11,12,13,14]. HOTAIR was shown to act as a scaffold that binds the histone modifiers PRC2 and the LSD1 complex, thereby controlling histone modification and regulating gene expression [15]. UCA1 was found to regulate the expression of several genes involved in tumorigenesis and embryonic development [17].