Abstract

The Cancer Genomics Cloud (CGC), powered by Seven Bridges, is an NCI-funded platform to make the analysis of cancer data faster, easier, and more accessible. As datasets grow larger, more diverse, and more complex, they become increasingly challenging for many cancer researchers to analyze. The CGC provides a centralized resource for analysis by co-localizing three components within the same cloud platform: 1) large cancer datasets like The Cancer Genome Atlas (TCGA); 2) tools and workflows for analyzing public and private data; and 3) the computational capabilities to do large-scale analyses. In addition to simple data access, the CGC offers more than 400 best-practice workflows, the flexibility to bring private tools and notebooks, and the ability to complete interactive analyses, all with the speed of cloud computing. The number of available datasets and tools is rapidly increasing, connecting researchers to datasets from a wide range of sources and data types, such as proteomics and imaging. The platform has been continuously expanded to include new applications and features since its launch in 2016. Here we describe how new features have been implemented to broaden the data and workflows accessible to cancer researchers. First, we have integrated with several new nodes within the Cancer Research Data Commons to enable access to and analysis of proteomics, canine, and other cancer datasets, which, alongside the genomics data on the CGC, enable true multi-omic analysis. Second, we have expanded our infrastructure to enable computation on both Google and Amazon cloud environments, allowing for analysis of data held in either environment. We have also added support for RStudio within the CGC, along with Python and Julia notebooks. Researchers can complete their entire workflow on the platform, from data discovery through visualizations, while streamlining collaboration and speeding the time from hypothesis to conclusion.
Altogether, these added features enable a network of Findable, Accessible, Interoperable, and Reusable (FAIR) datasets towards making cancer data analysis faster, easier, and more accessible for all.

Citation Format: Manisha Ray, Jack DiGiovanna, Jelena Radenkovic, Marko Tosic, Nikola Mirkovic, Boris Majic, Vladan Andjus, Brandi Davis-Dusenbery. The Cancer Genomics Cloud enables complex and multi-omic data science in the cloud [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 3223.
