Abstract

The Precision Oncology Software Environment Interoperable Data Ontologies Network (POSEIDON) was conceived at City of Hope (COH) as a cloud-based computational and data storage platform. The platform brings together patient data from the COH electronic medical records (EMR) and research data from the laboratories, and enables parallelized computation on multiprocessor virtual machines. In POSEIDON, well-established computational environments, such as R, co-exist with custom-written pipelines. Daratumumab, an anti-CD38 therapeutic antibody, is a drug used for maintenance treatment of multiple myeloma and is administered after stem cell transplant. Despite its effectiveness, disease progression inevitably occurs after some time, and patients stop responding to daratumumab. A daratumumab sensitivity research trial is being conducted at COH, seeking to elucidate how multiple myeloma gains resistance to daratumumab. Many patients undergo 20-22 treatment cycles, while some progress early and their treatment is stopped. Peripheral blood samples are collected from patients on the trial after each cycle of daratumumab administration. All blood specimens are analyzed with CyTOF mass cytometry to identify and quantify cell populations. A subset of samples is also analyzed with single-cell RNA-seq to determine gene expression dynamics per cell population. Finally, the laboratory values of each patient are added to the body of data. Given the multivariate longitudinal nature of the study and the sheer volume of data obtained, a robust, well-structured informatics system is needed. Here we present the application of the POSEIDON environment to our ongoing trial. As shown in the Figure, POSEIDON is established within a cloud (AWS, Azure), which provides practically unlimited storage space. The storage is redundantly distributed across multiple cloud nodes so that data loss is unlikely.
Also, any number of virtual machines may be deployed to perform computation. The structured data are stored in POSEIDON in a SQLite database, which is in turn updated every two hours from the SQL Server located at COH. The SQL Server at COH is populated by the laboratory with processed cytometry data and patient records. R notebooks within POSEIDON communicate with the SQLite database to extract, transform and analyze the trial's data. Investigators from the collaborating laboratories (i.e., Hematology, Mathematical Oncology) interact with the notebooks to perform the analyses and present the results. The raw data are stored in POSEIDON as well. POSEIDON therefore establishes a single-source-of-truth paradigm, which presents a single version of the data to all involved parties. The cost associated with POSEIDON is twofold: computation and storage. In our case, a virtual machine with 4 CPU cores, 16 GB RAM and 128 GB of solid-state storage cost $0.23 per hour when running. The storage cost for the project's 14.6 GB of data is $0.28 per month. In conclusion, POSEIDON can be scaled up beyond COH and beyond hematology.
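The notebook-to-database step described above can be illustrated with a minimal sketch. The abstract states that the notebooks are written in R; Python is used here purely for illustration. The table name `cytof_populations`, its columns, and the demo rows are hypothetical placeholders, not the trial's schema or data.

```python
import sqlite3


def create_demo_db():
    """Build an in-memory SQLite database with a HYPOTHETICAL schema.

    The real POSEIDON schema is not given in the abstract; the table,
    columns, and rows below are made-up placeholders.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE cytof_populations (
            patient_id    TEXT    NOT NULL,
            cycle         INTEGER NOT NULL,
            population    TEXT    NOT NULL,
            cell_fraction REAL    NOT NULL,
            PRIMARY KEY (patient_id, cycle, population)
        )
        """
    )
    # Synthetic rows, deliberately inserted out of cycle order.
    rows = [
        ("P001", 2, "NK", 0.04),
        ("P001", 1, "NK", 0.10),
        ("P001", 1, "T", 0.55),
        ("P002", 1, "NK", 0.12),
    ]
    conn.executemany("INSERT INTO cytof_populations VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return conn


def population_trajectory(conn, patient_id, population):
    """Return [(cycle, cell_fraction), ...] for one patient and one
    cell population, ordered by treatment cycle."""
    cur = conn.execute(
        "SELECT cycle, cell_fraction FROM cytof_populations "
        "WHERE patient_id = ? AND population = ? ORDER BY cycle",
        (patient_id, population),
    )
    return cur.fetchall()


if __name__ == "__main__":
    conn = create_demo_db()
    for cycle, fraction in population_trajectory(conn, "P001", "NK"):
        print(f"cycle {cycle}: NK fraction {fraction:.2f}")
```

Because every notebook would read from the same mirrored database, a query like this returns the same longitudinal series to each collaborating group, which is the practical meaning of the single-source-of-truth paradigm.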
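The two cost components reported above can be combined into a rough monthly estimate. The sketch below uses only the figures quoted in the abstract ($0.23 per VM hour, $0.28 per month for 14.6 GB); the number of VM hours is a hypothetical input, and the storage charge is assumed to stay flat at the reported value.

```python
# Figures quoted in the abstract.
VM_RATE_USD_PER_HOUR = 0.23   # 4 CPU cores, 16 GB RAM, 128 GB SSD, while running
STORAGE_USD_PER_MONTH = 0.28  # for the project's 14.6 GB of data
STORAGE_GB = 14.6


def monthly_cost(vm_hours):
    """Estimated monthly cost in USD for a given number of VM running hours.

    Assumes the storage charge stays at the reported monthly figure.
    """
    if vm_hours < 0:
        raise ValueError("vm_hours must be non-negative")
    return VM_RATE_USD_PER_HOUR * vm_hours + STORAGE_USD_PER_MONTH


def storage_rate_per_gb():
    """Per-GB monthly storage rate implied by the reported figures."""
    return STORAGE_USD_PER_MONTH / STORAGE_GB


if __name__ == "__main__":
    hours = 40  # hypothetical usage: 40 VM hours in a month
    print(f"{hours} VM hours: ${monthly_cost(hours):.2f} per month")
    print(f"implied storage rate: ${storage_rate_per_gb():.4f} per GB per month")
```

Under these assumptions the computation term dominates: 40 hours of VM time comes to about $9.48 for the month, of which storage accounts for only $0.28.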
