Abstract

High-parameter flow cytometry is a widely used tool in both research and clinical settings for accurate immunophenotyping in immunology, oncology, and many other disciplines. As dimensionality increases, manual analysis of the data becomes more complex and time-consuming. Over the past decade, several automated methods have been developed for cell population clustering and dimensionality reduction of high-parameter flow cytometry data. These methods have sped up and simplified the discovery of cell populations not observed with manual gating strategies. However, the input and output of such tools are stochastic in nature, which makes their results difficult to reuse with novel samples such as diagnostic patient samples. Several groups have recently attempted to use machine learning algorithms to analyze clinical flow cytometry data. These studies focused on training machine learning models to predict a specific diagnosis for each sample. This approach can be limiting when a sample does not fully meet the diagnostic criteria for a specific diagnosis, or when the patient's clinical data are unavailable. To our knowledge, no studies have used machine learning to assign individual cells to cellular populations, whether normal or abnormal, based on their raw fluorescence signature, in order to help a diagnostician interpret their significance in a clinical context. In this study, we used a data set of ~240 million individual cells from ~9,500 flow cytometry files from 2,300 patients to train a machine learning model to predict cell population classifications as annotated by expert hematopathologists. These annotations were incorporated into an end-to-end machine learning-based decision support system that allows rapid, automated analysis of high-dimensional flow cytometry data and produces output that aids the diagnostician in reaching a diagnosis.
A clinical validation highlighted the robustness of this methodology, with 100% sensitivity and 94% specificity for the identification of abnormal B-cell populations in a cohort of 1,500 patients. The decision support system also identifies abnormal T-cell and myeloid cell populations with 100% sensitivity. Overall, the decision support system saves significant time by removing the need for manual gating and analysis, and it aids the diagnostician by identifying discrete cell populations and providing descriptive information about any abnormal populations identified.

