Abstract

Genomic studies are now being undertaken on thousands of samples, which requires new computational tools that can rapidly analyze data to identify clinically important features. Inferring structural variations in cancer genomes from mate-paired reads is a combinatorially difficult problem. We introduce Fastbreak, a fast and scalable toolkit that enables the analysis and visualization of large amounts of data from projects such as The Cancer Genome Atlas.

Highlights

  • Genomic analysis of cancer and other genetic diseases is changing from the study of individuals to the study of large populations

  • This is exemplified by large-scale projects such as The Cancer Genome Atlas (TCGA), a multi-institution consortium working to build a comprehensive compendium of genomic information that promises to reveal the molecular basis of cancer and lead to new discoveries and therapies

  • An analysis of the genes disrupted across hundreds of ovarian cancer and glioblastoma samples (Figure 4) shows that Fastbreak results can distinguish tumor samples from blood samples and, to a lesser extent, one disease type from another. The same analysis identifies strong similarities in the gene functions and pathways disrupted by structural variation (see "Robustness of analysis and quality assurance of data")


Summary

Introduction

Genomic analysis of cancer and other genetic diseases is changing from the study of individuals to the study of large populations. The Fastbreak algorithm and associated tools are available as open source software at http://code.google.com/p/fastbreak/ and incorporate several features, including a scalable rule-based approach: the system uses a set of rules designed to detect the signatures of structural variations (SVs) in a single pass over the data and accumulates this information in efficient, parallelizable data structures.
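The single-pass, rule-based idea can be illustrated with a minimal Python sketch. This is not Fastbreak's actual code: the `MatePair` record, the three rules in `classify`, and the `BIN_SIZE` and `MAX_DISTANCE` values are hypothetical choices made for illustration only. The sketch shows each mate pair being tested once against a small rule set, with hits accumulated in per-bin counters that can be merged across independently processed chunks.

```python
from collections import Counter
from typing import Iterable, NamedTuple, Optional


class MatePair(NamedTuple):
    """Hypothetical record for one mapped mate pair."""
    chrom1: str
    pos1: int
    strand1: str
    chrom2: str
    pos2: int
    strand2: str


BIN_SIZE = 1000        # assumed genomic bin width, for illustration only
MAX_DISTANCE = 10000   # assumed largest "normal" mate distance


def classify(pair: MatePair) -> Optional[str]:
    """Apply illustrative rules; return a signature label or None."""
    if pair.chrom1 != pair.chrom2:
        return "interchromosomal"
    # Assumption: normally mapped mates land on opposite strands.
    if pair.strand1 == pair.strand2:
        return "same_orientation"
    if abs(pair.pos2 - pair.pos1) > MAX_DISTANCE:
        return "long_distance"
    return None


def scan(pairs: Iterable[MatePair]) -> Counter:
    """Single pass over the data: count signatures per (label, chrom, bin)."""
    counts: Counter = Counter()
    for pair in pairs:
        label = classify(pair)
        if label is not None:
            counts[(label, pair.chrom1, pair.pos1 // BIN_SIZE)] += 1
    return counts


def merge(partials: Iterable[Counter]) -> Counter:
    """Combine counters produced by independent workers."""
    total: Counter = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts
    return total


pairs = [
    MatePair("chr1", 1500, "+", "chr2", 300, "-"),
    MatePair("chr1", 1600, "+", "chr1", 50000, "-"),
    MatePair("chr1", 2000, "+", "chr1", 4000, "-"),
    MatePair("chr1", 1700, "+", "chr1", 1900, "+"),
]
# Two chunks scanned separately, then merged.
combined = merge([scan(pairs[:2]), scan(pairs[2:])])
```

Because the counters are additive, scanning chunks separately and merging gives the same result as scanning everything at once, which is the property that makes this kind of accumulator easy to parallelize.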
