Abstract

Gene fusions and rearrangements are widely recognized as significant drivers of tumorigenesis. Key fusion events have been reported in hematological malignancies (e.g. BCR-ABL in CML) and, more recently, in solid tumors (e.g. EML4-ALK in NSCLC). Next-generation sequencing, particularly RNA-seq, has rapidly led to the discovery of novel recurrent gene fusions in a wide variety of tumors (e.g. lung, breast and bladder). Although the overall frequency of any specific fusion event may be low, such events have demonstrated significant value in guiding or developing treatment options and improving clinical outcomes. Large-scale public sequencing efforts such as The Cancer Genome Atlas (TCGA) provide a tremendous opportunity to implement creative analysis approaches and to expand the therapeutic opportunities of targeting gene fusions in cancer. In this study, we describe a comprehensive survey of fusion events involving a select number of genes across various tumor types in TCGA. The computational demands of gene fusion discovery using RNA-seq data from thousands of tumors pose a serious challenge for pan-cancer analyses of clinically relevant gene fusions. To address this problem, we have developed an analytical pipeline that queries genes of interest for 5′ or 3′ rearrangements and that drastically lowers computational cost and increases throughput compared to existing fusion analysis pipelines. This efficiency is achieved primarily by restricting the “alignment search space” of sequenced transcript reads to a fraction of the full read count, thereby enabling tangible gains in speed without sacrificing specificity or sensitivity. We applied this approach to search for fusion events involving well-characterized genes in approximately 5700 paired-end RNA-seq tumor samples from 20 different cancer types sequenced by the TCGA project, completing the analysis in around 2 months on a modest hardware setup.
Our results were validated against the reported prevalence in the respective cancer types (e.g. ETS-family gene fusions in prostate cancer and EML4-ALK fusions in lung cancer). Strikingly, we identified several novel fusion partners for known fused oncogenes. Furthermore, both newly identified and previously known fusion genes were discovered in tumor types in which they had not been reported, thus expanding the fusion landscape of well-known genes. For example, some of the 27 NTRK fusion genes found were observed in indications other than those previously reported. In summary, we have successfully executed a valid, functional and efficient analysis pipeline to reveal oncogenic rearrangements that play a key role in the initiation and progression of cancer. These events molecularly define clinical subsets of disease and, as such, can guide personalized targeted therapies.

Citation Format: Henrik Edgren, Kalle Ojala, Anja Ruusulehto, Gopi Ganji. Rapid pan-cancer identification of previously unidentified fusion genes to enable novel targeted therapeutics. [abstract]. In: Proceedings of the 106th Annual Meeting of the American Association for Cancer Research; 2015 Apr 18-22; Philadelphia, PA. Philadelphia (PA): AACR; Cancer Res 2015;75(15 Suppl):Abstract nr 4793. doi:10.1158/1538-7445.AM2015-4793