Abstract
Background: The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels. Here, we studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing.
Results: In MCF-7 breast cancer cells, we find 2700 genes with interdependent alternative transcription initiation, splicing and polyadenylation events, both in proximal and distant parts of mRNA molecules, including examples of coupling between transcription start sites and polyadenylation sites. The analysis of three human primary tissues (brain, heart and liver) reveals similar patterns of interdependency between transcription initiation and mRNA processing events. We predict thousands of novel open reading frames from full-length mRNA sequences and obtain evidence for their translation by shotgun proteomics. The mapping database rescues 358 previously unassigned peptides and improves the assignment of others. By recognizing sample-specific amino-acid changes and novel splicing patterns, full-length mRNA sequencing improves proteogenomics analysis of MCF-7 cells.
Conclusions: Our findings demonstrate that our understanding of transcriptome complexity is far from complete and provide a basis to reveal largely unresolved mechanisms that coordinate transcription initiation and mRNA processing.
Highlights
The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels.
Detection and quantification of full-length transcripts in MCF-7 cells: To investigate the genome-wide coupling of transcription initiation and messenger RNA (mRNA) processing, full-length mRNAs from MCF-7 human breast cancer cells were sequenced on 147 SMRT cells using the Iso-Seq method on the Pacific Biosciences RSII platform (Additional file 1: Table S1).
Transcript structures were defined by applying the isoform-level clustering algorithm (ICE) on full-length reads [15], capturing the entire mRNA molecule.
Summary
The multifaceted control of gene expression requires tight coordination of regulatory mechanisms at the transcriptional and post-transcriptional levels. We studied the interdependence of transcription initiation, splicing and polyadenylation events on single mRNA molecules by full-length mRNA sequencing. Tight regulation and coordination of these processes ensures the production of a (limited) set of cell-, [...] RNA sequencing (RNA-seq) has become a central technology for deciphering global RNA expression patterns. Reconstructing alternative transcripts and estimating their expression levels from standard RNA-seq experiments is limited and error-prone because reads are relatively short (typically up to 150 nt). Single-molecule long reads that capture the entire RNA molecule can offer a better understanding of the rich patterns of alternative transcription initiation and mRNA processing events, and of the underlying biology.
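To make the idea of "interdependence on single mRNA molecules" concrete, here is a minimal sketch of how coupling between transcription start sites (TSS) and polyadenylation sites (PAS) could be tested once each full-length read has been assigned one of each. The text above does not say which statistical test the authors used; Fisher's exact test on a 2x2 table is an assumption made for illustration, and the function names, site labels and read counts are hypothetical toy values.

```python
from collections import Counter
from math import comb


def coupling_table(reads):
    """Build a TSS-by-PAS contingency table.

    reads: iterable of (tss_label, pas_label) pairs, one pair per
    full-length read (hypothetical input format).
    """
    counts = Counter(reads)
    tss_labels = sorted({tss for tss, _ in counts})
    pas_labels = sorted({pas for _, pas in counts})
    table = [[counts[(tss, pas)] for pas in pas_labels] for tss in tss_labels]
    return tss_labels, pas_labels, table


def fisher_exact_2x2(table):
    """Two-sided Fisher exact p-value for a 2x2 table of read counts."""
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x reads in the top-left cell,
        # given fixed row and column totals.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum all tables that are at most as probable as the observed one.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Toy gene with two TSSs and two PASs; counts are invented for illustration.
toy_reads = (
    [("TSS1", "PAS1")] * 40
    + [("TSS1", "PAS2")] * 10
    + [("TSS2", "PAS1")] * 10
    + [("TSS2", "PAS2")] * 40
)
tss_labels, pas_labels, table = coupling_table(toy_reads)
p_value = fisher_exact_2x2(table)
print(tss_labels, pas_labels, table, p_value)
```

A small p-value indicates that the choice of polyadenylation site is not independent of the start site within the same molecules. Short reads generally cannot support such a test directly, because a single read rarely spans both ends of the transcript. Across many genes, the resulting p-values would also need multiple-testing correction.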