Abstract
Transcript expression is spatially and temporally regulated and is complex, precise and specific; however, most studies have focused on protein-coding genes. Recently, massively parallel cDNA sequencing (RNA-seq) has emerged as a new and promising tool for transcriptome research, and numerous non-coding RNAs, especially lincRNAs, have been identified and characterized as important regulators of diverse biological processes. In this study, we used ultra-deep RNA-seq data from 15 mouse tissues to study the diversity and dynamics of non-coding RNAs in mouse. Using our own criteria, we identified a total of 16,249 non-coding genes (21,569 non-coding RNAs). We annotated these non-coding RNAs by diverse properties and found that, compared with protein-coding genes, they are generally shorter, have fewer exons, are expressed at lower levels and are more strikingly tissue-specific. Moreover, these non-coding RNAs show significant enrichment of transcriptional initiation and elongation signals, including histone modifications (H3K4me3, H3K27me3 and H3K36me3), RNAPII binding sites and CAGE tags. Gene set enrichment analysis (GSEA) revealed several sets of lincRNAs associated with diverse biological processes such as immune effector process, muscle development and sexual reproduction. Taken together, this study provides a more comprehensive annotation of mouse non-coding RNAs and offers an opportunity for future functional and evolutionary studies of mouse non-coding RNAs.
Highlights
Previous studies demonstrated that mammalian genomes are pervasively transcribed (Clark et al, 2011; The ENCODE Project Consortium, 2007)
After pre-processing of the raw data, all high-quality data were aligned to the mouse genome with GSNAP, yielding 1.86 billion mapped fragments (142.97 Gb, 52.46-fold genome coverage) (Table 1)
We constructed transcript models for each tissue with Cufflinks using the mappable fragments. We then integrated the assembled transcript models from the 15 tissues with Cuffmerge, obtaining 75,749 loci; 44,420 loci remained after filtering out loci that completely overlapped genes annotated in RefSeq, UCSC and Ensembl
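The filtering step in the last highlight can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes that "completely overlapping" means a locus lies entirely within one annotated gene on the same chromosome, it ignores strand, and the `Interval` type, the `filter_contained` function and the toy coordinates are hypothetical names and values chosen for illustration only.

```python
from collections import defaultdict
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical genomic interval: 0-based start (inclusive), end (exclusive)."""
    chrom: str
    start: int
    end: int


def filter_contained(loci, genes):
    """Keep loci that are NOT fully contained in any annotated gene.

    Assumption: containment on the same chromosome is the filter criterion;
    strand is ignored in this sketch.
    """
    genes_by_chrom = defaultdict(list)
    for gene in genes:
        genes_by_chrom[gene.chrom].append(gene)

    kept = []
    for locus in loci:
        candidates = genes_by_chrom.get(locus.chrom, [])
        contained = any(
            g.start <= locus.start and locus.end <= g.end for g in candidates
        )
        if not contained:
            kept.append(locus)
    return kept


# Toy data (made up): two annotated genes and four assembled loci.
genes = [Interval("chr1", 100, 500), Interval("chr2", 0, 50)]
loci = [
    Interval("chr1", 150, 300),  # fully inside the chr1 gene: filtered out
    Interval("chr1", 450, 600),  # only partially overlaps: kept
    Interval("chr1", 700, 900),  # intergenic: kept
    Interval("chr3", 10, 20),    # chromosome without annotated genes: kept
]
kept = filter_contained(loci, genes)
print(len(loci), "loci assembled,", len(kept), "kept after filtering")
```

The linear scan over genes per chromosome is adequate for a toy example; for tens of thousands of loci, a sorted-interval or interval-tree lookup would be the more practical choice.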
Summary
Previous studies demonstrated that mammalian genomes are pervasively transcribed (Clark et al, 2011; The ENCODE Project Consortium, 2007). The mammalian genome is transcribed not only into mRNAs but also into a large number of non-coding RNAs (Carninci et al, 2005; Guttman et al, 2009; Sati et al, 2012; Zheng et al, 2007). Non-coding RNAs have been identified and play important roles in various processes, including imprinting, X-inactivation, cell cycle and developmental processes, especially the regulation of pluripotency (Brown et al, 1992; Dinger et al, 2008; Guttman et al, 2011; Hawkins and Morris, 2010; Heard and Disteche, 2006; Hu et al, 2012; Pauli et al, 2011, 2012; Yang and Kuroda, 2007).
Y., et al Sci China Life Sci June (2016) Vol. No.6