Abstract
Isoform sequencing (Iso-Seq) uses long-read technology to produce highly accurate full-length reads of mRNA transcripts. Visualization of individual mRNA molecules can reveal new details of transcript variation within understudied portions of mRNA, such as the 5' untranslated region (UTR). Differential 5' UTRs may contain motifs, upstream open reading frames (uORFs), and secondary structures that regulate translation, or they may indicate changes in promoter usage, where transcriptional control can affect protein expression levels. To begin to explore isoform variation during T-cell activation, we generated the first Iso-Seq reference transcriptome of activated human CD4 T cells. Within this dataset, we discovered many novel splice- and end-variant transcripts. Remarkably, one in every eight genes expressed in our dataset had a notable proportion of transcripts with 5' UTRs lengthened by over 100 bp compared to the longest corresponding UTR in the GENCODE dataset. Among these end-variant transcripts, two novel isoforms were identified for CXCR5, a chemokine receptor associated with T follicular helper (Tfh) cell function and differentiation. When investigated in a model cell system, these lengthened UTRs conferred reduced transcript stability and, for one of the isoforms, short uORFs introduced by the added length altered protein expression kinetics. This study highlights instances in which current reference databases are incomplete relative to the information obtained by long-read sequencing of intact mRNA. Iso-Seq is thus a promising approach to better understanding the plasticity of promoter usage, alternative splicing, and UTR sequences that influence RNA stability and translation efficiency.