Abstract

Machine learning is increasingly used to analyze flow cytometric immunophenotyping data. Many analysis pipelines rely on unsupervised approaches such as dimensionality reduction or clustering, which may yield populations that are characterized differently from the expert definition. Here we present a new, supervised algorithm for population identification that faithfully replicates the expert manual gating strategy. Peripheral blood samples were drawn from 41 donors, stained with a dry, unitized B cell panel, and acquired on two cytometer units. A manual gating strategy was established, and an automatic model was trained on a subset of samples from this study (n=31 donors, split into training, validation and test sets). The performance of the automatic model was evaluated on the 10 unused samples. For populations with differences in median abundance (lymphocytes, CD19+ B-cells and CD19+ CD10+ cells), automatic identification correlated highly with manual gating and showed lower variability. The supervised approach presented here retains the gating hierarchy, in contrast to the unsupervised clustering performed by FlowSOM, which splits cells into exclusive metaclusters, leading to metacluster assignments that differ significantly from expert gating. Automating population identification can increase assay reproducibility by removing inter-operator variability. We show that a supervised machine learning tool that supports user-defined panels and gating strategies can reproduce expert analysis, reduce the variability of flow cytometry data analysis, and provide a better fit for purpose than an unsupervised clustering approach.

