Immunophenotype Discovery, Hierarchical Organization, and Template-Based Classification of Flow Cytometry Samples.

Ariful Azad, Alex Pothen, Bartek Rajwa

doi:10.3389/fonc.2016.00188

Abstract

We describe algorithms for discovering immunophenotypes from large collections of flow cytometry samples and using them to organize the samples into a hierarchy based on phenotypic similarity. The hierarchical organization is helpful for effective and robust cytometry data mining, including the creation of collections of cell populations characteristic of different classes of samples, robust classification, and anomaly detection. We summarize a set of samples belonging to a biological class or category with a statistically derived template for the class. Whereas individual samples are represented in terms of their cell populations (clusters), a template consists of generic meta-populations (a group of homogeneous cell populations obtained from the samples in a class) that describe key phenotypes shared among all those samples. We organize an FC data collection in a hierarchical data structure that supports the identification of immunophenotypes relevant to clinical diagnosis. A robust template-based classification scheme is also developed, but our primary focus is on the discovery of phenotypic signatures and inter-sample relationships in an FC data collection. This collective analysis approach is more efficient and robust since templates describe phenotypic signatures common to cell populations in several samples while ignoring noise and small sample-specific variations. We have applied the template-based scheme to analyze several datasets, including one representing a healthy immune system and one of acute myeloid leukemia (AML) samples. The last task is challenging due to the phenotypic heterogeneity of the several subtypes of AML. However, we identified thirteen immunophenotypes corresponding to subtypes of AML and were able to distinguish acute promyelocytic leukemia (APL) samples with the markers provided. Clinically, this is helpful since APL has a different treatment regimen from other subtypes of AML.
Core algorithms used in our data analysis are available in the flowMatch package at www.bioconductor.org. It has been downloaded nearly 6,000 times since 2014.

Highlights

Feature selection is the problem of identifying a representative set of features from a large dataset to construct a classification model
Whereas individual samples are represented in terms of their cell populations, a template consists of generic meta-populations that describe key phenotypes shared among all those samples
We have described a set of algorithms for feature selection in a collection of flow cytometry samples by identifying immunophenotypes

Summary

INTRODUCTION

Feature selection is the problem of identifying a representative set of features from a large dataset to construct a classification model. Current fluorescence-based technology supports the measurement of up to twenty proteins simultaneously in each cell [6], whereas atomic mass cytometry systems such as CyTOF [7] can measure more than forty markers per cell. When thousands of such high-dimensional samples are produced in an experiment, researchers have no alternative but to automate the data analysis. We extend our prior work [24, 25] and that of other researchers by clearly defining steps in template-based data analysis and developing a generic framework for robust classification and immunophenotyping. For this purpose, we have developed a scoring function that accounts for the diversity of the myeloid cell populations in the various subtypes of AML.
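The template idea can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the authors' method and not the flowMatch API: each sample is reduced to a list of cluster centroids, populations are paired by a greedy nearest-centroid rule (a stand-in for the mixed edge cover algorithm named in the outline), samples are folded into the template one at a time rather than through a hierarchy, and a test sample is assigned to the class whose template lies closest. All function names, the `max_dist` threshold, and the data values are hypothetical.

```python
from math import dist


def mean_point(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def match_populations(a, b, max_dist):
    """Greedily pair centroids in a with centroids in b, closest pairs first.

    Assumption: a simple stand-in for a proper matching algorithm.
    Pairs farther apart than max_dist are left unmatched.
    """
    candidates = sorted(
        (dist(p, q), i, j) for i, p in enumerate(a) for j, q in enumerate(b)
    )
    used_a, used_b, matches = set(), set(), []
    for d, i, j in candidates:
        if d > max_dist:
            break
        if i not in used_a and j not in used_b:
            used_a.add(i)
            used_b.add(j)
            matches.append((i, j))
    return matches


def build_template(samples, max_dist=1.0):
    """Return a template: a list of meta-populations (lists of centroids)."""
    template = [[c] for c in samples[0]]
    for sample in samples[1:]:
        centers = [mean_point(meta) for meta in template]
        matched = set()
        for i, j in match_populations(centers, sample, max_dist):
            template[i].append(sample[j])
            matched.add(j)
        # Populations with no counterpart start their own meta-population.
        for j, c in enumerate(sample):
            if j not in matched:
                template.append([c])
    return template


def template_distance(sample, template):
    """Mean distance from each sample centroid to its nearest meta-population."""
    centers = [mean_point(meta) for meta in template]
    return sum(min(dist(c, t) for t in centers) for c in sample) / len(sample)


def classify(sample, templates):
    """Pick the class label whose template is closest to the sample."""
    return min(templates, key=lambda label: template_distance(sample, templates[label]))


# Hypothetical two-marker centroids for two classes of samples.
healthy = [[(1.0, 1.0), (5.0, 5.0)], [(1.2, 0.9), (5.1, 4.8)]]
aml = [[(1.0, 1.0), (9.0, 2.0)], [(0.9, 1.1), (9.2, 2.1)]]
templates = {"healthy": build_template(healthy), "aml": build_template(aml)}

test_sample = [(1.1, 1.0), (8.9, 2.0)]
print(classify(test_sample, templates))  # aml
```

Because each meta-population averages over several samples, small sample-specific shifts in a centroid have little effect on the template, which is the robustness argument made in the abstract.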

STEPS IN ANALYZING FC DATA

Removing Unintended Cells

Data Transformation and Variance Stabilization

Cell Population Identification

Registering Cell Populations across Samples

Overview of the Mixed Edge Cover Algorithm

Creating Templates from a Collection of Samples

Overview of the Template Construction

Comparisons among Different Algorithms for Creating Templates

Sample Classification Based on Templates

Classification Score of a Sample in the AML Dataset

The Healthy Dataset

Preprocessing and Spectral Unmixing

Variance Stabilization

Building Class Templates

Comparison with Alternative Approaches

The AML Dataset

Cell Populations in Healthy and AML Samples

Healthy and AML Templates

Identifying Meta-Clusters

Impact of Each Tube in the Classification

Classifying Test Samples

Findings

CONCLUSION


Journal: Frontiers in Oncology	Publication Date: Aug 31, 2016
Citations: 13	License type: cc-by
