Abstract
Cancer early detection is one of the most critical areas of cancer research, as it offers the greatest potential for improving patient outcomes. The International Alliance for Cancer Early Detection (ACED) is a global partnership of world-leading cancer research institutions in the UK and US, established in 2019 to accelerate and revolutionize research in this field. ACED brings together the expertise of the Canary Center at Stanford University, the University of Cambridge, the Knight Cancer Institute at Oregon Health & Science University, University College London, and the University of Manchester, together with Cancer Research UK, with the goal of catalyzing new collaborations and research in early detection. A major challenge for international collaboration is enabling researchers at different institutions to collect, share, and discover datasets. Collaborative data science depends on three fundamental functions: 1) controlled data-sharing mechanisms; 2) structured metadata to support data exploration and discovery; and 3) easy computational access to the data. To support this collaboration, ACED began developing an Integrated Data Platform (IDP). The ACED-IDP is based on the Gen3 software platform developed by the University of Chicago's Center for Translational Data Science. Built on software systems originally developed for the NCI's Genomic Data Commons, Gen3 has been used in several other data projects, including the Blood Profiling Atlas in Cancer (BloodPAC), the Australian BioCommons, and numerous other research data platforms. The structure of ACED required several innovations in the development of the IDP. Each member institution within the alliance has its own existing computing infrastructure, a separate authentication platform, and heterogeneous data types forming the basis of its research.
The IDP sought a cloud-ready strategy while remaining cognizant of the high cost of cloud egress fees, which hamper researchers' ability to download data. The Gen3 platform was extended with hybrid cloud support, so that on-premises as well as cloud object storage systems can be linked to the platform. This innovation permits each institution to share files using the mechanisms it sees fit. To unify the various datasets, the standard Gen3 schema was replaced with one derived from the Fast Healthcare Interoperability Resources (FHIR) standard. To support clinical data sets, tooling for the import and export of Observational Medical Outcomes Partnership (OMOP) data has been integrated. With this platform in place, we hope to advance international collaborations and accelerate early cancer detection research.

Citation Format: Brian Walsh, Liam Beckman, JD Burchett, Matthew Peterkort, Jordan Lee, Michael Fitzsimons, Peter Vassilatos, Binam Bajracharya, Craig Barnes, Jawad Qureshi, Robert Grossman, Carrie Yakura, Yaozhi Lu, Sarah Burge, Daniel Kelberman, Erin Watson, Kyle Ellrott. An integrated data platform to support the international alliance for cancer early detection [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 3558.