Enhancing knowledge discovery from cancer genomics data with Galaxy.

Marco A Albuquerque, Paul C Boutros, Jasleen K Grewal, Martin Krzywinski, Bruno M Grande, Prasath Pararajalingam, Sohrab P Shah, Selin Jessa, Elie J Ritch, Ryan D Morin

doi:10.1093/gigascience/gix015

Abstract

The field of cancer genomics has demonstrated the power of massively parallel sequencing techniques to inform on the genes and specific alterations that drive tumor onset and progression. Although large, comprehensive sequence data sets are increasingly available, data analysis remains an ongoing challenge, particularly for laboratories lacking dedicated resources and bioinformatics expertise. To address this, we have produced a collection of Galaxy tools that represent many popular algorithms for detecting somatic genetic alterations from cancer genome and exome data. We developed new methods for parallelizing these tools within Galaxy to reduce runtime, demonstrated their usability, and summarized their runtimes on multiple cloud service providers. Some tools represent extensions or refinements of existing toolkits to yield visualizations suited to cohort-wide cancer genomic analysis. For example, we present Oncocircos and Oncoprintplus, which generate data-rich summaries of exome-derived somatic mutations. We demonstrate workflows that integrate these tools for data integration and visualization on a cohort of 96 diffuse large B-cell lymphomas, which enabled the discovery of multiple candidate lymphoma-related genes. Our toolkit is available from our GitHub repository as Galaxy tool and dependency definitions and has been deployed using virtualization on multiple platforms, including Docker.

Highlights

An inherent problem in applying genomics to understand the molecular aetiology of cancer is the multidisciplinary skill set researchers need to draw meaningful inferences from high-throughput biological data.
Although algorithms for handling high-throughput sequence data are steadily being added to Galaxy, there remains a lack of tools and workflows tailored to common tasks in analyzing cancer genome and exome sequence data.
We demonstrate the utility of the Galaxy Cancer Genomics Toolkit by applying the included workflows to a large cohort of diffuse large B-cell lymphoma (DLBCL) patients (n = 96). Through a combination of analytical and exploratory approaches that leverage multiple visualization tools implemented within the Toolkit, we uncover new candidate lymphoma-related genes and putative genetic features associated with each molecular subgroup.



Journal: GigaScience
Publication Date: Mar 9, 2017
Citations: 5
License type: CC BY 4.0
