Abstract
As one large class of non-coding RNAs (ncRNAs), long ncRNAs (lncRNAs) have gained considerable attention in recent years. Mutations and dysfunction of lncRNAs have been implicated in human disorders. Many lncRNAs exert their effects through interactions with the corresponding RNA-binding proteins. Several computational approaches have been developed, but only few are able to perform the prediction of these interactions from a network-based point of view. Here, we introduce a computational method named lncRNA–protein bipartite network inference (LPBNI). LPBNI aims to identify potential lncRNA–interacting proteins, by making full use of the known lncRNA–protein interactions. Leave-one-out cross validation (LOOCV) test shows that LPBNI significantly outperforms other network-based methods, including random walk (RWR) and protein-based collaborative filtering (ProCF). Furthermore, a case study was performed to demonstrate the performance of LPBNI using real data in predicting potential lncRNA–interacting proteins.
Highlights
An increasing number of studies show that approximately 2% of the whole mammalian genome represents proteincoding genes, whereas the majority of the genome consists a ORCID: 0000-0001-5185-7045. b ORCID: 0000-0001-9910-8967. c ORCID: 0000-0002-5788-894X
protein-based collaborative filtering (ProCF) is based on the idea that if a protein interacts with an Long non-coding RNA (ncRNA) (lncRNAs), similar proteins will be recommended as interacting with this lncRNAP
The nodes that have only one link are not considered in the performance evaluation, so we further get 4796 lncRNA–protein interactions which match that condition, and this dataset is taken as ‘gold standard’ data in the Leave-one-out cross validation (LOOCV) test
Summary
An increasing number of studies show that approximately 2% of the whole mammalian genome represents proteincoding genes, whereas the majority of the genome consists a ORCID: 0000-0001-5185-7045. b ORCID: 0000-0001-9910-8967. c ORCID: 0000-0002-5788-894X. An increasing number of studies show that approximately 2% of the whole mammalian genome represents proteincoding genes, whereas the majority of the genome consists a ORCID: 0000-0001-5185-7045. Peer review under responsibility of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China. Long ncRNAs (lncRNAs), which consist of more than 200 nucleotides, constitute a large class of ncRNAs [6,7]. In the past several years, the number of identified lncRNAs has been increasing sharply because of the development of both bioinformatics tools and experimental techniques. Production and hosting by Elsevier B.V. on behalf of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.