Abstract
BackgroundLong non-coding RNAs (lncRNAs) have emerged as a class of factors that are important for regulating development and cancer. Computational prediction of lncRNAs from ultra-deep RNA sequencing has been successful in identifying candidate lncRNAs. However, the complexity of handling and integrating different types of genomics data poses significant challenges to experimental laboratories that lack extensive genomics expertise.ResultTo address this issue, we have developed lncRNA-screen, a comprehensive pipeline for computationally screening putative lncRNA transcripts over large multimodal datasets. The main objective of this work is to facilitate the computational discovery of lncRNA candidates to be further examined by functional experiments. lncRNA-screen provides a fully automated easy-to-run pipeline which performs data download, RNA-seq alignment, assembly, quality assessment, transcript filtration, novel lncRNA identification, coding potential estimation, expression level quantification, histone mark enrichment profile integration, differential expression analysis, annotation with other type of segmented data (CNVs, SNPs, Hi-C, etc.) and visualization. Importantly, lncRNA-screen generates an interactive report summarizing all interesting lncRNA features including genome browser snapshots and lncRNA-mRNA interactions based on Hi-C data.ConclusionlncRNA-screen provides a comprehensive solution for lncRNA discovery and an intuitive interactive report for identifying promising lncRNA candidates. lncRNA-screen is available as open-source software on GitHub.
Highlights
Long non-coding RNAs have emerged as a class of factors that are important for regulating development and cancer
Setup, execution and interactive browsing of the results Long non-coding RNAs (lncRNAs)-screen can be downloaded as a single zip file from GitHub through the link below
We developed lncRNA-screen, an easy-to-use integrative lncRNA discovery platform for comprehensive mapping and characterization of lncRNAs using a variety of genomics datasets
Summary
Long non-coding RNAs (lncRNAs) have emerged as a class of factors that are important for regulating development and cancer. The most basic definition of a lncRNA is a long RNA, at least 200 base pairs in length, that does not encode protein. LncRNA2Function [10], LNCipedia [11], lncRNAdb [12] and lncRNAtor [13] provide comprehensive databases for known, annotated lncRNAs. iSeeRNA [14], CPC [15] and CPAT [16] introduced machine learning-based approaches only focusing on assessment of the coding probability of potential lncRNAs. lncRScan [17] and its new version lncRScan-SVM [18] are pipelines which provide novel multi-exonic lncRNA only discovery from RNA sequencing (RNA-seq), lacking the ability to integrate
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.