Abstract

As one large class of non-coding RNAs (ncRNAs), long ncRNAs (lncRNAs) have gained considerable attention in recent years. Mutations and dysfunction of lncRNAs have been implicated in human disorders. Many lncRNAs exert their effects through interactions with the corresponding RNA-binding proteins. Several computational approaches have been developed, but only a few are able to predict these interactions from a network-based point of view. Here, we introduce a computational method named lncRNA–protein bipartite network inference (LPBNI). LPBNI aims to identify potential lncRNA-interacting proteins by making full use of the known lncRNA–protein interactions. A leave-one-out cross validation (LOOCV) test shows that LPBNI significantly outperforms other network-based methods, including random walk (RWR) and protein-based collaborative filtering (ProCF). Furthermore, a case study was performed to demonstrate the performance of LPBNI using real data in predicting potential lncRNA-interacting proteins.

Highlights

  • An increasing number of studies show that approximately 2% of the whole mammalian genome represents protein-coding genes, whereas the majority of the genome is non-coding

  • Protein-based collaborative filtering (ProCF) is based on the idea that if a protein interacts with a long non-coding RNA (lncRNA), similar proteins will be recommended as interacting with this lncRNA

  • Nodes that have only one link are not considered in the performance evaluation; 4796 lncRNA–protein interactions satisfy this condition, and this dataset is taken as the ‘gold standard’ in the leave-one-out cross validation (LOOCV) test
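The last two highlights can be illustrated with a minimal sketch. It is not the authors' implementation: the toy interaction matrix, the function names, the use of cosine similarity between protein interaction profiles, and the reading of "nodes that have only one link" as "both endpoints of an interaction must have more than one link" are all assumptions made for illustration.

```python
from math import sqrt

def eligible_edges(adj):
    """Known interactions whose lncRNA and protein both have more than one link.
    (Assumed reading of the single-link exclusion rule.)"""
    lnc_deg = [sum(row) for row in adj]
    prot_deg = [sum(col) for col in zip(*adj)]
    return [(i, j)
            for i, row in enumerate(adj)
            for j, linked in enumerate(row)
            if linked and lnc_deg[i] > 1 and prot_deg[j] > 1]

def cosine(u, v):
    """Cosine similarity of two interaction profiles (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def procf_score(adj, i, j):
    """Protein-based collaborative filtering: proteins similar to protein j
    that already interact with lncRNA i vote for the pair (i, j)."""
    cols = list(zip(*adj))  # one interaction profile per protein
    sims = [cosine(cols[k], cols[j]) if k != j else 0.0
            for k in range(len(cols))]
    total = sum(sims)
    if total == 0:
        return 0.0
    return sum(adj[i][k] * sims[k] for k in range(len(cols))) / total

def loocv_rank(adj, i, j):
    """Hide the known interaction (i, j), rescore, and return the rank of
    protein j among the proteins not linked to lncRNA i."""
    held_out = [row[:] for row in adj]
    held_out[i][j] = 0
    candidates = [k for k, linked in enumerate(held_out[i]) if not linked]
    ranked = sorted(candidates, key=lambda k: -procf_score(held_out, i, k))
    return ranked.index(j) + 1

# Hypothetical toy network: rows are lncRNAs, columns are proteins.
toy = [
    [1, 1, 0],
    [1, 1, 1],
    [0, 0, 1],
]
edges = eligible_edges(toy)
ranks = {edge: loocv_rank(toy, *edge) for edge in edges}
```

In this toy network the interaction of the third lncRNA is skipped because that lncRNA has a single link; removing it would leave the node with no known interactions to learn from.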


Summary

Introduction

An increasing number of studies show that approximately 2% of the whole mammalian genome represents protein-coding genes, whereas the majority of the genome is non-coding. Long ncRNAs (lncRNAs), which consist of more than 200 nucleotides, constitute a large class of ncRNAs [6,7]. In the past several years, the number of identified lncRNAs has been increasing sharply because of the development of both bioinformatics tools and experimental techniques.

a ORCID: 0000-0001-5185-7045. b ORCID: 0000-0001-9910-8967. c ORCID: 0000-0002-5788-894X. Peer review under responsibility of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China. Production and hosting by Elsevier B.V. on behalf of Beijing Institute of Genomics, Chinese Academy of Sciences and Genetics Society of China.

