Abstract

Cancer researchers are increasingly conducting multi-omic research and performing integrative analyses on combinations of genomic, proteomic, transcriptomic, imaging, single-cell and other data modalities. However, it is challenging for a researcher to access, aggregate and analyze data and metadata from different sources in a scalable and reproducible manner, because the individual datasets may be disconnected from each other and have separate authentication and authorization requirements. To overcome these challenges for academic researchers, we developed a workflow execution system in a data fabric called the Biomedical Research Hub (BRH). The BRH is powered by Gen3, an open-source, Kubernetes-based software stack that allows cancer researchers to create their own data fabric and interoperate with data from multiple sources. The workflow execution system uses Nextflow and allows researchers to run containerized applications in the cloud in a secure and isolated environment. Data from multiple resources can be combined for analysis using convenient pay models, including NIH STRIDES. We plan to demonstrate the application of our system on a scientific use case involving Clonal Hematopoiesis of Indeterminate Potential (CHIP), a phenomenon that has been associated with aging, cancer, cardiovascular diseases, infection and all-cause mortality. We run containerized CHIP workflows in the cloud using two datasets accessible through the BRH: (i) the Genomic Data Commons, the world’s largest source of harmonized cancer data, and (ii) BioData Catalyst, an NHLBI ecosystem that drives discovery and innovation for heart, lung, blood and sleep disorders. Our system is well suited for machine learning on large aggregated cancer datasets and for federated learning tasks.
Citation Format: Aarti Venkat, Pauline Ribeyre, Jawad Qureshi, Sai Shanmukha Narumanchi, J Montgomery Maxwell, Bill Winslow, Sara Volk Garcia, Chris Meyer, Tzintzuni Garcia, Peter Vassilatos, Clinton Malson, Zhenyu Zhang, Robert Grossman. A workflow execution system in a data fabric for integrative cancer analyses [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 6242.

