Abstract

Most human genes generate multiple transcript isoforms. The differential expression of these isoforms can help specify cell types. Diverse transcript isoforms arise from the use of alternative transcription start sites, polyadenylation sites and splice sites; however, the relative contribution of these processes to isoform diversity in normal human physiology is unclear. To address this question, we investigated cell type-dependent differences in exon usage of over 18 000 protein-coding genes in 23 cell types from 798 samples of the Genotype-Tissue Expression Project. We found that about half of the expressed genes displayed tissue-dependent transcript isoforms. Alternative transcription start and termination sites, rather than alternative splicing, accounted for the majority of tissue-dependent exon usage. We confirmed the widespread tissue-dependent use of alternative transcription start sites in a second, independent dataset, Cap Analysis of Gene Expression data from the FANTOM consortium. Moreover, our results indicate that most tissue-dependent splicing involves untranslated exons and therefore may not increase proteome complexity. Thus, alternative transcription start and termination sites are the principal drivers of transcript isoform diversity across tissues, and may underlie the majority of cell type-specific proteomes and functions.

Highlights

  • Alternative splicing, alternative promoter usage and alternative polyadenylation enable the generation of multiple transcript isoforms from a single gene [1,2,3]

  • Since the dataset does not contain each tissue for each individual, we identified subsets of data that could be analyzed as fully crossed designs

  • We found that the proportions among the five exon categories were different between exonic regions with tissue-dependent usage (TDU) arising from alternative splicing (TDU-AS), exonic regions with TDU but no evidence of alternative splicing (TDU-NAS) and the background sets of exons (P-value < 2.2 × 10⁻¹⁶, χ²-test; Figure 5A, Supplementary Figure S21 and Table S8)
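
The comparison in the last highlight can be illustrated with a χ²-test on a contingency table of exon sets by exon categories. The sketch below is only a minimal illustration, assuming the test was a standard χ²-test of independence on such a table. The function name `chi_square_independence` is hypothetical and the counts are invented placeholders, not values from the paper. It computes the test statistic and the degrees of freedom using only the Python standard library. Converting the statistic to a P-value would require the χ² distribution, for example from a statistics package.

```python
from typing import List, Tuple


def chi_square_independence(table: List[List[float]]) -> Tuple[float, int]:
    """Return (chi-square statistic, degrees of freedom) for a contingency table.

    Rows are groups (here: sets of exonic regions), columns are categories.
    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if category proportions were identical in all rows.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    dof = (len(table) - 1) * (len(col_totals) - 1)
    return statistic, dof


# Hypothetical counts, NOT taken from the paper: three sets of exonic regions
# (rows) tallied over five unnamed exon categories (columns).
counts = {
    "TDU-AS": [120, 80, 40, 30, 30],
    "TDU-NAS": [60, 50, 90, 70, 30],
    "background": [500, 400, 300, 200, 100],
}

statistic, dof = chi_square_independence(list(counts.values()))
print(f"chi-square statistic = {statistic:.1f}, degrees of freedom = {dof}")
```

With three sets and five categories the test has (3 − 1) × (5 − 1) = 8 degrees of freedom. A large statistic relative to that reference distribution indicates that the category proportions differ between the sets.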

Summary

Introduction

Alternative splicing, alternative promoter usage and alternative polyadenylation enable the generation of multiple transcript isoforms from a single gene [1,2,3]. At least 70% of genes have multiple polyadenylation sites, >50% of genes have alternative transcription start sites and most genes undergo alternative splicing [4,5,6,7]. These molecular processes have the potential to substantially increase the repertoire of transcripts, proteins and functions encoded by mammalian genomes [8,9,10]. Large-scale proteomics surveys indicate that the abundance of isoforms with disrupted domains, if not zero, is generally below levels that can currently be detected with high confidence [22,23]. This raises the possibility that the function of a large proportion of transcript isoforms, if any, is at the level of the RNA rather than the protein.

