Abstract
A model-based gating strategy is developed for sorting cells and analyzing populations of single cells. The strategy, named CCAST (Clustering, Classification and Sorting Tree), isolates homogeneous subpopulations from a heterogeneous population of single cells using a data-derived decision tree representation that can be applied to cell sorting. Because CCAST does not rely on expert knowledge, it removes human bias and variability when determining the gating strategy. It combines any clustering algorithm with silhouette measures to identify underlying homogeneous subpopulations, then applies recursive partitioning techniques to generate a decision tree that defines the gating strategy. CCAST produces an optimal strategy for cell sorting by automating the selection of gating markers, the corresponding gating thresholds and the gating sequence, all of which are typically defined manually. Even though CCAST is optimized for cell sorting, it can also be applied to the identification and analysis of homogeneous subpopulations in heterogeneous single cell data. We apply CCAST to single cell data from both breast cancer cell lines and normal human bone marrow. On the SUM159 breast cancer cell line data, CCAST indicates at least five distinct cell states based on two surface markers (CD24 and EPCAM) and provides a gating strategy for sorting that produces more homogeneous subpopulations than previously reported. When applied to normal bone marrow data, CCAST reveals an efficient strategy for gating T-cells without prior knowledge of the major T-cell subtypes and the markers that best define them. On the normal bone marrow data, CCAST also reveals two major mature B-cell subtypes, namely CD123+ and CD123- cells, which were not revealed by manual gating but show distinct intracellular signaling responses.
More generally, the CCAST framework could be used on other biological and non-biological high dimensional data types that are mixtures of unknown homogeneous subpopulations.
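The two steps named in the abstract, scoring a clustering with silhouette measures and then deriving a decision tree by recursive partitioning, can be illustrated with a minimal sketch. This is not the CCAST implementation: the synthetic two-marker data, the cluster labels (which stand in for the output of any clustering algorithm), the Gini impurity split criterion, and all function names are assumptions made for illustration only.

```python
import random
from collections import Counter

def silhouette(points, labels):
    """Mean silhouette width of a labeling (Euclidean distance, >= 2 clusters)."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5
    scores = []
    for i, p in enumerate(points):
        sums, counts = Counter(), Counter()
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] += dist(p, q)
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:  # singleton cluster: silhouette taken as 0
            scores.append(0.0)
            continue
        a = sums[own] / counts[own]  # mean distance within own cluster
        b = min(sums[c] / counts[c] for c in counts if c != own)  # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels):
    """Marker index and threshold minimizing weighted Gini impurity (assumed criterion)."""
    best = None
    for m in range(len(points[0])):
        values = sorted(set(p[m] for p in points))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [l for p, l in zip(points, labels) if p[m] <= t]
            right = [l for p, l in zip(points, labels) if p[m] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, m, t)
    return best

def build_tree(points, labels, markers, depth=0, max_depth=3):
    """Recursive partitioning: each internal node is one gate (marker, threshold)."""
    split = None
    if len(set(labels)) > 1 and depth < max_depth:
        split = best_split(points, labels)
    if split is None:  # pure node, depth limit, or no usable threshold: make a leaf
        return Counter(labels).most_common(1)[0][0]
    _, m, t = split
    low = [(p, l) for p, l in zip(points, labels) if p[m] <= t]
    high = [(p, l) for p, l in zip(points, labels) if p[m] > t]
    return {
        "marker": markers[m], "index": m, "threshold": t,
        "low": build_tree([p for p, _ in low], [l for _, l in low], markers, depth + 1, max_depth),
        "high": build_tree([p for p, _ in high], [l for _, l in high], markers, depth + 1, max_depth),
    }

def gate(tree, cell):
    """Follow the gating sequence for one cell until a subpopulation leaf is reached."""
    while isinstance(tree, dict):
        tree = tree["low"] if cell[tree["index"]] <= tree["threshold"] else tree["high"]
    return tree

# Synthetic data (assumption): three well-separated populations on two markers.
rng = random.Random(0)
markers = ["CD24", "EPCAM"]
centers = {"A": (1.0, 1.0), "B": (1.0, 5.0), "C": (5.0, 1.0)}
points, labels = [], []
for name, (cx, cy) in centers.items():
    for _ in range(20):
        points.append((rng.gauss(cx, 0.3), rng.gauss(cy, 0.3)))
        labels.append(name)

# A coarser candidate labeling that merges A and B, for comparison.
merged = ["AB" if l in ("A", "B") else l for l in labels]
print("silhouette, 3 clusters:", round(silhouette(points, labels), 3))
print("silhouette, 2 clusters:", round(silhouette(points, merged), 3))

tree = build_tree(points, labels, markers)
print(tree)
```

In this sketch the labeling with the higher mean silhouette width is preferred, and each internal node of the resulting tree reads as one gate: a marker, a threshold, and a position in the gating sequence.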
Highlights
Understanding cancer heterogeneity is increasingly being regarded as critical in understanding cancer progression and overcoming therapeutic resistance [1,2,3,4]
This study proposes a data-driven gating strategy, CCAST, for sorting homogeneous subpopulations out of a heterogeneous population of single cells without relying on expert knowledge, thereby removing human bias and variability
CCAST is optimized for cell sorting but can be applied to the identification and analysis of homogeneous subpopulations
Summary
Understanding cancer heterogeneity is increasingly being regarded as critical in understanding cancer progression and overcoming therapeutic resistance [1,2,3,4]. Technological challenges have limited our ability to fully characterize intra-tumor heterogeneity. In recent years, however, characterizing heterogeneous populations of cells at the single-cell level using multidimensional fluorescence and mass flow cytometric data, combined with novel computational tools, has greatly improved our understanding of the extent of cellular heterogeneity [8,9]. Gating on a fluorescence-activated cell sorting (FACS) machine commonly refers to a manual process, performed by sequentially selecting regions from bivariate graphs that depict the expression of two markers at a time across all the cells. We make a distinction between gating algorithms that are optimized for sorting single cells and those optimized for analyzing a heterogeneous population of single cell data. Even though our gating strategy is optimized for cell sorting, it has value when used in the analysis of population data at the single cell level.
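The manual gating process described above, in which regions are selected one bivariate plot at a time, can be sketched as a sequence of rectangular gates. The `RectGate` class, the marker names, the expression values and the gate boundaries below are all hypothetical and chosen only to make the example concrete; they are not taken from the study.

```python
from dataclasses import dataclass

@dataclass
class RectGate:
    """Hypothetical rectangular region on a bivariate plot of two markers."""
    x_marker: str
    y_marker: str
    x_range: tuple
    y_range: tuple

    def contains(self, cell):
        x, y = cell[self.x_marker], cell[self.y_marker]
        return (self.x_range[0] <= x <= self.x_range[1]
                and self.y_range[0] <= y <= self.y_range[1])

def apply_gates(cells, gates):
    """Apply the gates in order; each gate sees only cells kept by the previous one."""
    kept = cells
    for g in gates:
        kept = [c for c in kept if g.contains(c)]
    return kept

# Made-up expression values for four cells (illustrative only).
cells = [
    {"CD3": 5.1, "CD19": 0.4, "CD4": 4.8, "CD8": 0.3},
    {"CD3": 5.4, "CD19": 0.2, "CD4": 0.5, "CD8": 4.9},
    {"CD3": 0.3, "CD19": 5.2, "CD4": 0.2, "CD8": 0.1},
    {"CD3": 4.7, "CD19": 0.6, "CD4": 5.3, "CD8": 0.4},
]

# Expert-chosen gating sequence: first CD3 vs CD19, then CD4 vs CD8.
gates = [
    RectGate("CD3", "CD19", (3.0, 8.0), (0.0, 1.0)),
    RectGate("CD4", "CD8", (3.0, 8.0), (0.0, 1.0)),
]
selected = apply_gates(cells, gates)
print(len(selected), "cells pass both gates")
```

Here the markers, the boundaries and the order of the gates are all fixed by hand, which is the set of choices a data-driven strategy aims to automate.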