Abstract

Background: Fusion genes are known to be drivers of many common cancers, so they are potential markers for diagnosis, prognosis or therapy response. The advent of paired-end RNA sequencing enhances our ability to discover fusion genes. While methods are available, routine analyses of large numbers of samples are still limited by high computational demands.

Results: We develop FuSeq, a fast and accurate method to discover fusion genes. It uses quasi-mapping to quickly map the reads, extracts initial candidates from split reads and fusion equivalence classes of mapped reads, and finally applies multiple filters and statistical tests to get the final candidates. We apply FuSeq to four validated datasets: breast cancer, melanoma and glioma datasets, and one spike-in dataset. The results reveal high sensitivity and specificity in all datasets, and compare well against other methods such as FusionMap, TRUP, TopHat-Fusion, SOAPfuse and JAFFA. In terms of computational time, FuSeq is two-fold faster than FusionMap and orders of magnitude faster than the other methods.

Conclusions: With the advantage of lower computational demands, FuSeq makes it practical to investigate fusion genes in large numbers of samples. FuSeq is implemented in C++ and R, and available at https://github.com/nghiavtr/FuSeq for non-commercial uses.

Highlights

  • Fusion genes are known to be drivers of many common cancers, so they are potential markers for diagnosis, prognosis or therapy response

  • Fusion genes are reported in different types of cancers, such as breast cancer [3, 4] and lung cancer. Read alignment is usually done by standard read alignment methods in RNA sequencing (RNA-seq), as in TopHat-Fusion [11] and SnowShoe-FTD

  • We develop FuSeq, a novel fusion detection method utilizing a recent quasi-mapping method for alignment that is substantially faster than traditional alignment methods [26]

Summary

Introduction

Fusion genes are known to be drivers of many common cancers, so they are potential markers for diagnosis, prognosis or therapy response. They are one type of structural chromosome rearrangement and have been found to play important roles in carcinogenesis [1, 2]. Fusion genes are closely associated with an increase of chimeric proteins, with cancer risk and with tumor phenotypes, all of which have clinical potential.

Methods exist to detect fusion transcripts using RNA-seq data, and their comparisons are available in several recent publications [9, 10]. These methods use various approaches, but generally include three main steps: (i) read alignment, (ii) fusion candidate detection and (iii) false positive elimination.
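
The three-step structure can be illustrated with a toy sketch in Python. This is not FuSeq's implementation or that of any cited tool: the `MappedPair` record, the gene names, and the `min_support` threshold are hypothetical, and step (i) is assumed to have already assigned each mate of a read pair to a gene. Candidate detection here simply counts read pairs whose mates map to different genes, and false positive elimination is reduced to a single minimum-support filter.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class MappedPair(NamedTuple):
    """One paired-end read after step (i): the gene each mate mapped to."""
    read_id: str
    gene_left: str
    gene_right: str


def detect_candidates(pairs: Iterable[MappedPair]) -> Counter:
    """Step (ii): count read pairs whose mates map to two different genes."""
    support = Counter()
    for p in pairs:
        if p.gene_left != p.gene_right:
            # Simplification: sort the gene names so A-B and B-A are counted
            # as one candidate. This discards orientation information.
            support[tuple(sorted((p.gene_left, p.gene_right)))] += 1
    return support


def filter_candidates(support: Counter, min_support: int = 2) -> dict:
    """Step (iii): drop candidates supported by too few read pairs."""
    return {genes: n for genes, n in support.items() if n >= min_support}


# Hypothetical toy input: four read pairs, already mapped to genes.
pairs = [
    MappedPair("r1", "GENE_A", "GENE_B"),
    MappedPair("r2", "GENE_B", "GENE_A"),
    MappedPair("r3", "GENE_A", "GENE_A"),  # both mates in one gene: no fusion
    MappedPair("r4", "GENE_C", "GENE_D"),  # only one supporting pair
]
candidates = filter_candidates(detect_candidates(pairs))
print(candidates)  # {('GENE_A', 'GENE_B'): 2}
```

In this toy input, only the GENE_A/GENE_B pair survives, because the GENE_C/GENE_D candidate has a single supporting read pair. Real methods apply far richer evidence in the last step, such as the split reads, multiple filters and statistical tests mentioned in the abstract.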

