Abstract

Background: Gene-fusion or chimeric transcripts have been implicated in the onset and progression of a variety of cancers. Massively parallel RNA sequencing (RNA-Seq) of the cellular transcriptome is a promising approach for the identification of chimeric transcripts of potential functional significance. We report here the development and use of an integrated computational pipeline for the de novo assembly and characterization of chimeric transcripts in 55 primary breast cancer and normal tissue samples.

Methods: An integrated computational pipeline was employed to screen the transcriptome of breast cancer and control tissues for high-quality RNA-sequencing reads. Reads were de novo assembled into contigs followed by reference genome mapping. Chimeric transcripts were detected, filtered and characterized using our R-SAP algorithm. The relative abundance of reads was used to estimate levels of gene expression.

Results: De novo assembly allowed for the accurate detection of 1959 chimeric transcripts to nucleotide level resolution and facilitated detailed molecular characterization and quantitative analysis. A number of the chimeric transcripts are of potential functional significance including 79 novel fusion-protein transcripts and many chimeric transcripts with alterations in their un-translated leader regions. A number of chimeric transcripts in the cancer samples mapped to genomic regions devoid of any known genes. Several ‘pro-neoplastic’ fusions comprised of genes previously implicated in cancer are expressed at low levels in normal tissues but at high levels in cancer tissues.

Conclusions: Collectively, our results underscore the utility of deep sequencing technologies and improved bioinformatics workflows to uncover novel and potentially significant chimeric transcripts in cancer and normal somatic tissues.

Highlights

  • Gene-fusion or chimeric transcripts have been implicated in the onset and progression of a variety of cancers

  • Massively parallel RNA sequencing (RNA-Seq) of the cellular transcriptome has emerged as a promising approach for the identification of previously uncharacterized fusion-gene or chimeric transcripts of potential functional significance [7, 11, 12, 13, 14, 15]

  • We found that a number of these fusion transcripts are of potential functional significance including novel fusion-proteins and chimeric transcripts with alterations in their untranslated leader regions (UTRs)

Introduction

Gene-fusions are a prevalent class of genetic variants that have been implicated in the onset and progression of a variety of cancers [1, 2]. These variants may be generated at the DNA level by genomic rearrangements (e.g., large deletions or insertions, inversions and/or chromosomal translocations [3]). Massively parallel RNA sequencing (RNA-Seq) of the cellular transcriptome has emerged as a promising approach for the identification of previously uncharacterized fusion-gene or chimeric transcripts of potential functional significance [7, 11, 12, 13, 14, 15]. For example, a recent RNA-Seq analysis of 24 primary breast cancer samples uncovered 15 subtype-specific fusion-genes that may serve as useful biomarkers of drug sensitivities [16]. Analysis of 89 breast cancer and control samples identified several fusion transcripts involving MAST (microtubule-associated serine-threonine) kinase and Notch-family genes that may be drivers of breast cancer onset and/or progression [17].

