Abstract

A goal of cancer research is to reveal cell subsets linked to continuous clinical outcomes to generate new therapeutic and biomarker hypotheses. We introduce a machine learning algorithm, Risk Assessment Population IDentification (RAPID), that is unsupervised and automated, identifies phenotypically distinct cell populations, and determines whether these populations stratify patient survival. With a pilot mass cytometry dataset of 2 million cells from 28 glioblastomas, RAPID identified tumor cells whose abundance independently and continuously stratified patient survival. Statistical validation within the workflow included repeated runs of stochastic steps and cell subsampling. Biological validation used an orthogonal platform, immunohistochemistry, and a larger cohort of 73 glioblastoma patients to confirm the findings from the pilot cohort. RAPID was also validated by finding known risk-stratifying cells and features in published data from blood cancer. Thus, RAPID provides an automated, unsupervised approach for finding statistically and biologically significant cells using cytometry data from patient samples.

Highlights

  • Malignant cells in human tumors are remarkably diverse in their functional cell identities and this intra-tumor cellular heterogeneity is closely linked to patient outcomes [1, 2]

  • The output of Risk Assessment Population IDentification (RAPID), when using t-SNE and FlowSOM, is a PDF containing a color-coded 2D t-SNE plot depicting all FlowSOM clusters, a 2D t-SNE plot colored by the clusters that were significantly associated with patient outcome, and Kaplan-Meier survival estimates of patients for each subset (Figure 1b)

  • When Glioblastoma Negative Prognostic (GNP) and Glioblastoma Positive Prognostic (GPP) cells were assessed simultaneously, abundance of GNP cells was the primary predictor of mortality (OS hazard ratio (HR)=1.06 [1.01-1.10], p=0.02), while abundance of GPP cells was the primary predictor of time to tumor progression (PFS HR=0.96 [0.93-1.00], p=0.04)
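To make the reported hazard ratios easier to read, the following minimal sketch shows how a per-unit HR from a proportional hazards model compounds over a larger change in abundance. It assumes the HRs above are expressed per one-unit increase in cell abundance (the unit is not stated in this summary), and the function name `scaled_hazard_ratio` and the 10-unit change are illustrative choices, not part of RAPID.

```python
def scaled_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio for a `delta`-unit change in a covariate.

    In a proportional hazards model the log-hazard is linear in the
    covariate, so the per-unit HR compounds multiplicatively: HR ** delta.
    """
    return hr_per_unit ** delta


# Values taken from the highlight above: point estimate and 95% interval bounds.
gnp_os = (1.06, 1.01, 1.10)   # GNP abundance, overall survival
gpp_pfs = (0.96, 0.93, 1.00)  # GPP abundance, progression-free survival

delta = 10  # hypothetical 10-unit increase in abundance (assumption)

for label, (hr, lower, upper) in [("GNP / OS", gnp_os), ("GPP / PFS", gpp_pfs)]:
    scaled = [scaled_hazard_ratio(value, delta) for value in (hr, lower, upper)]
    print(f"{label}: HR per {delta} units = {scaled[0]:.2f} "
          f"[{scaled[1]:.2f}-{scaled[2]:.2f}]")
```

Under this assumption, an HR close to 1 per unit can still correspond to a sizeable difference in risk between patients whose abundances differ by many units.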


Summary

Introduction

Malignant cells in human tumors are remarkably diverse in their functional cell identities and this intra-tumor cellular heterogeneity is closely linked to patient outcomes [1, 2]. Two new technologies were created in parallel: 1) a tailored set of 34 antibodies for single-cell mass cytometry of glioblastoma focused on phospho-protein signaling effectors, stem cell proteins, and transcription factors critical to neural development, and 2) an unsupervised cell discovery workflow termed RAPID (Risk Assessment Population IDentification). RAPID identifies prognostic cell subsets in glioblastoma disease. The second round of data analysis used an equal number of each patient's glioblastoma cells to create a single, common t-SNE map of glioblastoma cell phenotypes across all patients (N = 131,880 cells; 4,710 cells × 28 patients). Prior to creating this common map, mass cytometry standardization beads were used to remove batch effects and to set the variance-stabilizing arcsinh scale transformation for each channel following field-standard protocols [11, 38, 41]. This indicated that, once revealed, GNP and GPP cell subsets were phenotypically cohesive in a traditional cell biological sense and could be reliably quantified by traditional approaches compatible with standard clinical flow cytometric profiling.
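The equal-sampling and arcsinh steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the RAPID implementation: the function names, the toy three-channel data, and the cofactor value of 5 for every channel are hypothetical, and the ordering of transformation and subsampling in the actual workflow is not specified in this summary.

```python
import math
import random


def subsample_equal(cells_by_patient, n_per_patient, seed=0):
    """Draw the same number of cells from every patient, without replacement."""
    rng = random.Random(seed)
    pooled = []
    for patient_id in sorted(cells_by_patient):
        cells = cells_by_patient[patient_id]
        if len(cells) < n_per_patient:
            raise ValueError(f"{patient_id} has only {len(cells)} cells")
        pooled.extend(rng.sample(cells, n_per_patient))
    return pooled


def arcsinh_transform(cell, cofactors):
    """Apply arcsinh(x / cofactor) channel by channel to one cell."""
    return [math.asinh(value / cofactor) for value, cofactor in zip(cell, cofactors)]


# Toy data (assumption): 28 hypothetical patients, 3 channels per cell,
# and a different number of cells per patient.
data_rng = random.Random(1)
cells_by_patient = {
    f"patient_{i:02d}": [
        [data_rng.uniform(0, 500) for _ in range(3)] for _ in range(5000 + 10 * i)
    ]
    for i in range(28)
}

# Placeholder per-channel cofactors (assumption); the source sets the scale
# per channel using standardization beads.
cofactors = [5.0, 5.0, 5.0]

pooled = subsample_equal(cells_by_patient, n_per_patient=4710)
transformed = [arcsinh_transform(cell, cofactors) for cell in pooled]
print(len(transformed))  # 4,710 cells x 28 patients
```

Sampling the same number of cells per patient keeps patients with large samples from dominating the shared map, which is the stated purpose of using an equal number of each patient's cells.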
