Abstract

Esophageal squamous cell carcinoma is a leading cause of cancer death. Mapping the transcriptional landscape, including isoforms, fusion transcripts, and long noncoding RNAs, has played a central role in understanding the regulatory mechanisms underlying malignant processes. However, canonical methods such as short-read RNA-seq have difficulty defining entire polyadenylated RNA molecules. Here, we combined single-molecule real-time sequencing with RNA-seq to generate high-quality long reads and to survey the transcriptional program in esophageal squamous cells. Compared with a recent annotation of the human transcriptome (Ensembl 38 release 91), the single-molecule real-time data identified many unannotated transcripts, novel isoforms of known genes, and an expanding repository of long intergenic noncoding RNAs (lincRNAs). By integrating the annotation of the lincRNA catalog, 1,521 esophageal-cancer-specific lincRNAs were defined from single-molecule real-time reads. Kyoto Encyclopedia of Genes and Genomes enrichment analysis indicated that these lincRNAs and their target genes are involved in a variety of cancer signaling pathways. Isoform usage analysis revealed shifted alternative splicing patterns, which could be recaptured in clinical samples or were supported by previous studies. Using rigorous search criteria, we also detected multiple transcript fusions that are not documented in current gene fusion databases or readily identified from RNA-seq reads. Two novel fusion transcripts were verified by real-time PCR and Sanger sequencing. Overall, our long-read single-molecule sequencing greatly expands the current understanding of the full-length transcriptome in esophageal cells and provides novel insights into transcriptional diversity during oncogenic transformation.

Highlights

  • Esophageal squamous cell cancer (ESCC) is a serious malignancy with poor prognosis and a high mortality rate (Lin et al, 2013; Chen et al, 2016)

  • We chose three ESCC patient-derived cell lines because they cover all three types of cell differentiation seen in primary tumors: KYSE140 is derived from a patient with moderately differentiated squamous cell carcinoma, KYSE510 from a patient with well-differentiated squamous cell carcinoma, and TE5 from a patient with poorly differentiated squamous cell carcinoma

  • The mean read length ranged from 2.3 to 3.2 kbp in esophageal cells, indicating good quality of the single-molecule real-time (SMRT) dataset (Table S1A)
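
As an illustration of the read-length statistic quoted in the last highlight, the following is a minimal sketch of how a mean read length in kbp could be computed from FASTA-formatted reads. It is not the authors' pipeline: the function names `read_lengths` and `mean_read_length_kbp` and the toy reads are hypothetical, and FASTA input is an assumption made only for this example.

```python
from statistics import mean


def read_lengths(fasta_text):
    """Return the length of every sequence in a FASTA-formatted string."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A header closes the previous record (if any) and opens a new one.
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            # Sequence lines may be wrapped, so lengths are accumulated.
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def mean_read_length_kbp(fasta_text):
    """Mean read length in kilobase pairs (1 kbp = 1,000 bp)."""
    lengths = read_lengths(fasta_text)
    if not lengths:
        raise ValueError("no reads found")
    return mean(lengths) / 1000.0


# Toy input (hypothetical): one 2,000 bp read and one 3,000 bp read.
toy_fasta = ">read_1\n" + "A" * 2000 + "\n>read_2\n" + "ACGT" * 750 + "\n"
print(mean_read_length_kbp(toy_fasta))
```

For real SMRT data the same calculation would normally be run per cell line on the full read set, typically with an established sequence parser rather than hand-written parsing.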

Introduction

Esophageal squamous cell cancer (ESCC) is a serious malignancy with poor prognosis and a high mortality rate (Lin et al, 2013; Chen et al, 2016). To find altered signaling pathways and novel functional transcripts such as long intergenic noncoding RNAs (lincRNAs), several short-read-based transcriptome sequencing studies have been conducted over the past several years (Cancer Genome Atlas Research et al, 2017; Li et al, 2017). Typical RNA-seq captures a large number of contiguous short reads (about 100–250 bp) and reconstructs transcripts by statistical modeling. In contrast, the PacBio single-molecule real-time (SMRT) platform can sequence long circular consensus sequence reads of several thousand base pairs and therefore has a good chance of capturing full-length transcripts de novo. Hybrid PacBio SMRT sequencing provides a powerful tool to survey the transcriptional landscape in cells (Sharon et al, 2013; Tilgner et al, 2014; Weirather et al, 2015).

