Introduction
Flow cytometry (FC) immunophenotyping is essential for accurate and prompt diagnosis of acute myeloid leukemia (AML), with up to 95% of AML patients showing a detectable aberrant immunophenotype (IP) by FC. Current manual, subjective analysis of clinical FC data, however, lacks reproducibility and leads to variation in interpretation. In previous studies, we showed that supervised machine learning (ML) approaches can detect IP abnormalities and identify the acute leukemia subclassification accurately at the specimen level. We also found that comparable classification performance can be achieved using only a subset of the markers in the reagent panel (Ko et al., 2018; Monaghan et al., 2022). To interpret the ML classification results, we developed data clustering and discriminative learning methods that identify and visualize the diagnostic cell populations in a 2D space for hematopathology review (Ji et al., 2020; Lee et al., 2018). Building on these approaches, this study reports our findings on developing an interpretable cross-panel ML classification model to support timely and accurate diagnosis and subclassification of AML.

Materials and Methods
The sample-level AML diagnosis model used FC data from 53 bone marrow (BM) aspirate samples from Roswell Park Comprehensive Cancer Center (RPCCC) and 50 BM aspirate samples from UPMC. Data from RPCCC were acquired on a Beckman Coulter Navios EX and measured with the ClearLLab10C panel, whereas data from UPMC were acquired on a BD FACSCantoII and measured with a UPMC-developed diagnostic panel. We used only the parameters common to these 103 FC data sets and applied a previously published Gaussian Mixture Model-Support Vector Machine (GMM-SVM) approach to train a cross-panel, sample-level classification model for AML diagnosis (Ko et al., 2018; Monaghan et al., 2022).
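As a rough illustration of the cross-panel idea described above, the following Python sketch restricts two panels to their shared markers, fits a Gaussian mixture model on the pooled cells, summarizes each specimen by its mean component responsibilities, and trains a linear SVM on those per-specimen features. This is one plausible reading of a GMM-SVM pipeline, not the published implementation: the marker lists, the synthetic data, the number of mixture components, and the choice of mean responsibilities as features are all assumptions made for illustration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

# Hypothetical marker lists; the real panel contents are not given in the text.
PANEL_A = ["CD45", "CD34", "CD117", "CD13", "CD33", "HLA-DR", "CD19", "CD38"]
PANEL_B = ["CD45", "CD34", "CD117", "CD13", "CD33", "HLA-DR", "CD7", "CD56"]


def common_markers(panel_a, panel_b):
    """Return the markers present in both panels, in panel_a order."""
    shared = set(panel_b)
    return [m for m in panel_a if m in shared]


def select_columns(events, panel, markers):
    """Keep only the columns of `events` (cells x markers) named in `markers`."""
    idx = [panel.index(m) for m in markers]
    return events[:, idx]


def sample_features(gmm, events):
    """Summarize one specimen as its mean GMM responsibility per component."""
    return gmm.predict_proba(events).mean(axis=0)


def make_sample(rng, panel, is_aml, n_cells=500):
    """Synthetic stand-in for one specimen (assumption: not real FC data)."""
    events = rng.normal(0.0, 1.0, size=(n_cells, len(panel)))
    if is_aml:
        # Shift a third of the cells to mimic an aberrant population.
        events[: n_cells // 3] += 3.0
    return events


rng = np.random.default_rng(0)
markers = common_markers(PANEL_A, PANEL_B)

samples, labels = [], []
for i in range(20):
    panel = PANEL_A if i % 2 == 0 else PANEL_B  # alternate acquisition panels
    is_aml = i >= 10
    events = make_sample(rng, panel, is_aml)
    samples.append(select_columns(events, panel, markers))
    labels.append(int(is_aml))
labels = np.array(labels)

# Fit the mixture on pooled cells. In a real evaluation this fit would use
# training folds only, so that held-out specimens do not leak into the model.
gmm = GaussianMixture(n_components=4, covariance_type="diag", random_state=0)
gmm.fit(np.vstack(samples))

# One feature vector per specimen, then a sample-level linear SVM.
X = np.array([sample_features(gmm, s) for s in samples])
clf = SVC(kernel="linear").fit(X, labels)
train_accuracy = clf.score(X, labels)
print(markers, X.shape, train_accuracy)
```

Because every specimen is reduced to the same shared-marker columns before the mixture is fit, specimens from either panel end up in one common feature space, which is what allows a single classifier to be trained across sites.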
Three-fold cross-validation was conducted to assess model performance, using sensitivity, specificity, classification accuracy, and area under the receiver operating characteristic curve (AUC). For the cell-level visualization and analysis, all samples in the training set were pooled, and uniform manifold approximation and projection (UMAP) was then applied to transform and visualize the cell populations within the samples. Specifically, 26 non-neoplastic and 27 AML samples from the M2 panel of the ClearLLab10C from the RPCCC cohort were combined for the UMAP visualization. The FlowSOM unsupervised clustering method and the DAFI semi-supervised clustering method (Lee et al., 2018) were then applied to identify all cell populations that differed significantly between the AML samples and the non-neoplastic controls.

Results
The cross-panel AML versus non-neoplastic classifier, which combined FC data from UPMC and RPCCC, achieved exceptional performance. The AML detection model using all parameters on the ClearLLab10C panel achieved an AUC of 100%. When using only the parameters shared by the ClearLLab10C and UPMC panels, the classification model achieved an AUC of 99.5%, while the cross-panel classifier achieved an AUC of 99.1% (Table 1). Specimen-level visualization confirmed the separation of AML samples from non-neoplastic controls. The UMAP plots, created by pooling non-neoplastic and AML samples, exhibited distinct patterns for different AML subtypes (M1 to M5) and for cases with a genetic mutation (e.g., AML IDH1), all of which differed from control samples. Using the automated gating analysis, cell populations identified as different between AML and control samples were characterized and visualized in 2D scatter plots for straightforward interpretation and assessment based on their immunophenotype.
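The evaluation metrics named in the Methods (sensitivity, specificity, classification accuracy, and AUC under three-fold cross-validation) can be sketched in plain Python as follows. The labels and scores below are made-up illustrative values, not results from this study, and the helper names, the 0.5 decision threshold, and the round-robin fold assignment are assumptions.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fn, fp, tn) for binary labels where 1 = AML, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fn, fp, tn


def roc_auc(y_true, scores):
    """AUC as the chance that a random positive outscores a random negative.

    Ties between a positive and a negative count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y_true, k=3):
    """Assign sample indices to k folds, spreading each class round-robin."""
    folds = [[] for _ in range(k)]
    for label in sorted(set(y_true)):
        members = [i for i, t in enumerate(y_true) if t == label]
        for j, i in enumerate(members):
            folds[j % k].append(i)
    return folds


# Illustrative held-out scores only; in practice each score would come from
# the model trained on the other two folds.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

folds = stratified_folds(y_true, k=3)
tp, fn, fp, tn = confusion_counts(y_true, y_pred)
sensitivity = tp / (tp + fn)
specificity = tn / (tn + fp)
accuracy = (tp + tn) / len(y_true)
auc = roc_auc(y_true, scores)
print(sensitivity, specificity, accuracy, auc)  # 0.75 0.75 0.75 0.9375
```

Sensitivity, specificity, and accuracy depend on the chosen decision threshold, whereas AUC is computed from the ranking of the scores alone.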
Conclusion
Our study demonstrated the feasibility of developing and assessing an ML classification model for AML detection and diagnosis using overlapping markers in FC data across reagent panels and sites. Advanced visualization and automated gating analysis enabled interpretation of the ML classification by identifying and characterizing the diagnostic cell populations and their phenotypic heterogeneity, providing insights into AML heterogeneity at both the specimen and the single-cell level. These findings highlight the potential of ML to improve patient care through timely and accurate diagnosis and subclassification of AML.