Background: Acute myeloid leukemia (AML) comprises diverse genomic subgroups and remains hard to treat in most patients. Despite recent breakthroughs in the therapeutic arsenal, clinical use of therapeutic antibodies or chimeric antigen receptor T (CAR-T) cells has lagged behind that in other hematological malignancies. In fact, CD33 is the only target of an approved antibody-based strategy for this disease to date, highlighting the need to identify promising new targets. AML cells span a wide range of aberrant myeloid differentiation programs, which complicates the identification, by bulk genomics, of targets expressed in the most immature leukemic cells. Aims and Methods: To characterize the expression landscape of surface proteins in immature leukemic cells, we performed single-cell RNA sequencing (scRNA-seq, 10x 3' Reagent Kits) of primary human AML cells from 20 specimens of the Leucegene cohort enriched in intermediate and adverse genetic backgrounds (KMT2A-rearranged, n=5; chromosome 5 and/or 7 deletions (abn5/7), n=5; complex karyotype, n=4; NPM1/DNMT3A/FLT3-ITD triple-mutant, n=3; and others, n=3). A Random Forest classifier was developed to classify AML cells into distinct differentiation stages in an unbiased manner, using normal bone marrow-derived scRNA-seq data from the Human Cell Atlas (HCA) consortium. Genes were scored based on their probability of coding for proteins expressed at the cell surface using the SPAT algorithm developed by our group (https://doi.org/10.1101/2023.07.07.547075), and high-scoring genes were retained. To validate surface expression, we concomitantly analyzed the surface proteome (hereafter named surfaceome) of 100 primary human AML samples from the Leucegene cohort, including all 20 samples profiled by scRNA-seq. Results: After quality control, we profiled and characterized 103 690 high-quality cells (mean of 5185 cells/sample).
We trained a Random Forest classifier to annotate cells in a two-step process, first identifying plasma cells based on a restricted list of genes abundantly expressed in these cells and subsequently assigning the remaining cells to one of 33 cell types. We performed five-fold cross-validation of the model and subsequently determined the accuracy of our classifier to be 92% on the test subset of the HCA data. Applied to our AML cell collection, the classifier assigned a total of 35 053 cells (34%) to the Hematopoietic Stem Cell (HSC)-like category, corresponding to the most phenotypically immature leukemic cells in each patient sample (range across samples: 4% to 74%). Accordingly, HSC-like AML cells preferentially express genes associated with normal HSCs, such as CD34, FAM30A, and SPINK2, and globally lack expression of mature lineage-defining genes, further validating our classifier. The proportion of HSC-like cells varied among AML subgroups and was lowest in KMT2A-r AML (median 19%) and highest in abn5/7 samples (46%). Integration of our AML atlas using the Harmony algorithm preserved differentiation hierarchies across samples, with most cell types, including HSC-like cells, occupying a defined area in the low-dimensional embedding. To identify new surface antigens specifically expressed in immature leukemic cells, we compared the expression of genes with a high SPAT score (≥8) between AML HSC-like cells and normal HSCs (HCA), and identified 60 genes significantly overexpressed in AML immature cells. Of those, 39 genes were also detected at the protein level by the surfaceome analysis, supporting their predicted expression at the cell surface in AML samples. Of these 39 genes, 23 (59%) were detected in over 80% of the specimens analyzed by the surfaceome, and thus are nearly universally expressed in our AML cohort. To identify targets of therapies that could be repurposed, we next evaluated the relevance of our findings by querying the Thera-SAbDab database.
Most interestingly, 8 of the 39 AML-specific HSC markers are targeted by therapeutic antibodies that are FDA-approved or in clinical trials, either for the treatment of AML (n=4: IL3RA, FLT3, CD37 and TNFRSF10B) or for other indications (n=4). Conclusion: Our genetically diverse AML single-cell atlas, supported by mass spectrometry, enables the identification of both subset-specific and pan-AML surface protein genes. These represent potential targets for the development of antibody-based strategies or for therapy repurposing in AML.
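The successive filters described in the Results (SPAT score ≥8, overexpression in AML HSC-like cells relative to normal HSCs, detection in the surfaceome, and detection in over 80% of specimens) can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Gene` record, the `prioritize` function, the placeholder gene names and all toy values are hypothetical, and the overexpression call is taken as a precomputed flag because the abstract does not specify the statistical test used.

```python
from dataclasses import dataclass

SPAT_THRESHOLD = 8      # "high (>=8) SPAT score" cutoff stated in the abstract
NEAR_UNIVERSAL = 0.80   # "over 80% of the specimens" stated in the abstract


@dataclass
class Gene:
    """Hypothetical per-gene summary; field names are illustrative only."""
    name: str
    spat_score: float
    overexpressed_in_aml_hsc_like: bool  # precomputed flag (test not specified in the abstract)
    surfaceome_detection_rate: float     # fraction of specimens with protein detected (0.0 if none)


def prioritize(genes):
    """Apply the filters in the order the abstract describes them."""
    high_spat = [g for g in genes if g.spat_score >= SPAT_THRESHOLD]
    overexpressed = [g for g in high_spat if g.overexpressed_in_aml_hsc_like]
    validated = [g for g in overexpressed if g.surfaceome_detection_rate > 0]
    near_universal = [g for g in validated
                      if g.surfaceome_detection_rate > NEAR_UNIVERSAL]
    return overexpressed, validated, near_universal


# Toy input with placeholder names and made-up values, for illustration only.
toy_genes = [
    Gene("GENE_A", 9.5, True, 0.95),   # passes every filter
    Gene("GENE_B", 8.0, True, 0.40),   # validated, but not near-universal
    Gene("GENE_C", 8.7, True, 0.0),    # overexpressed, never detected as protein
    Gene("GENE_D", 6.0, True, 0.90),   # SPAT score below the cutoff
    Gene("GENE_E", 9.1, False, 0.85),  # not overexpressed in AML HSC-like cells
]

overexpressed, validated, near_universal = prioritize(toy_genes)
print("overexpressed: ", [g.name for g in overexpressed])
print("validated:     ", [g.name for g in validated])
print("near-universal:", [g.name for g in near_universal])
```

In the study itself, the analogous steps yielded 60 overexpressed genes, 39 of which were detected by the surfaceome, with 23 of those detected in over 80% of specimens.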