Abstract

Although transcriptome alteration is an essential driver of carcinogenesis, the effects of chromosomal structural alterations on the cancer transcriptome are not yet fully understood. Short-read transcript sequencing has prevented researchers from directly exploring full-length transcripts, forcing them to focus on individual splice sites. Here, we develop a pipeline for Multi-Sample long-read Transcriptome Assembly (MuSTA), which enables construction of a transcriptome from long-read sequence data. Using the constructed transcriptome as a reference, we analyze RNA extracted from 22 clinical breast cancer specimens. We identify a comprehensive set of subtype-specific and differentially used isoforms, extending our knowledge of isoform regulation to unannotated isoforms, including a short form of TNS3. We also find that the exon–intron structure of fusion transcripts depends on their genomic context, and we identify double-hop fusion transcripts that are transcribed from complex structural rearrangements. For example, a double-hop fusion results in aberrant expression of an endogenous retroviral gene, ERVFRD-1, which is normally expressed exclusively in the placenta and is thought to protect the fetus from maternal rejection; expression of this gene is elevated in several TCGA samples with ERVFRD-1 fusions. Our analyses provide direct evidence that full-length transcript sequencing of clinical samples can add to our understanding of cancer biology and genomics in general.

Highlights

  • Although transcriptome alteration is an essential driver of carcinogenesis, the effects of chromosomal structural alterations on the cancer transcriptome are not yet fully understood

  • The transcriptome subsequently went through SQANTI[27] filtering, and potential artifact transcripts were removed by a random forest algorithm

  • We obtained the number of uniquely associated full-length non-chimeric (FLNC) reads in each sample; this count, PBcount, serves as a complementary measure of isoform expression
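
The read-count measure in the last highlight can be illustrated with a minimal sketch. It is not the MuSTA implementation: the function name `pb_count`, the input layout (one `(sample, read_id, isoform_id)` tuple per read-to-isoform association), and the reading of "uniquely associated" as "associated with exactly one isoform" are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def pb_count(read_assignments):
    """Count uniquely associated FLNC reads per isoform in each sample.

    read_assignments: iterable of (sample, read_id, isoform_id) tuples
    (hypothetical layout, one tuple per read-to-isoform association).
    Returns {sample: Counter({isoform_id: number_of_unique_reads})}.
    """
    # Collect the set of isoforms each read is associated with, per sample.
    isoforms_per_read = defaultdict(set)
    for sample, read_id, isoform_id in read_assignments:
        isoforms_per_read[(sample, read_id)].add(isoform_id)

    counts = defaultdict(Counter)
    for (sample, _read_id), isoforms in isoforms_per_read.items():
        # Assumption: a read is "uniquely associated" when it maps to
        # exactly one isoform; ambiguous reads are skipped.
        if len(isoforms) == 1:
            (isoform_id,) = isoforms
            counts[sample][isoform_id] += 1
    return dict(counts)


# Toy data: read r3 is associated with two isoforms, so it is not counted.
assignments = [
    ("S1", "r1", "iso_A"),
    ("S1", "r2", "iso_A"),
    ("S1", "r3", "iso_A"),
    ("S1", "r3", "iso_B"),
    ("S2", "r4", "iso_B"),
]
counts = pb_count(assignments)
print(counts["S1"]["iso_A"], counts["S2"]["iso_B"])
```

Because ambiguous reads are dropped rather than split between isoforms, a count of this kind is conservative, which is one reason it would complement, rather than replace, other expression estimates.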

Summary

Introduction

Although transcriptome alteration is an essential driver of carcinogenesis, the effects of chromosomal structural alterations on the cancer transcriptome are not yet fully understood. Several groups recently conducted comprehensive studies of cancer-specific alternative splicing[2,8,9,10] and showed that RNA alteration affects cancer genes in a manner that complements DNA alteration[2]. All of these studies depended on RNA-seq technology, which produces relatively short reads and requires imputation to generate full-length transcripts. As a result, these analyses were limited to individual splice-site abnormalities and could not target the resulting transcripts directly or efficiently. We constructed a cohort-wide breast cancer transcriptome from directly sequenced transcripts and characterized its complexity and subtype-specific regulation; hundreds of thousands of the isoforms we identified were previously unannotated. Our findings show that transcript-targeted analyses can directly capture a catalog of cancer isoforms originating from complex structural alterations.

