The therapeutic effect of allo-HSCT relies on T cell alloreactivity driven by genetic nonidentity between donor and recipient (D-R) pairs. In the setting of HLA-matched transplants, endogenous proteins in recipient cells that differ from those of the donor because of genetic polymorphisms can provide distinct HLA-binding peptides, which serve as minor histocompatibility antigens (mHAgs). Hematopoietically restricted mHAgs represent ideal targets to separate graft-versus-leukemia (GvL) from graft-versus-host disease (GvHD) effects. To date, however, the quest for optimal GvL mHAgs has identified only a handful of targets, raising concerns about the clinical feasibility of targeting them. We hypothesized that systematic mHAg prediction from whole-exome sequencing (WES) of D-R pairs could help identify novel candidate mHAgs with an acceptable safety profile for therapeutic targeting. We focused on AML/MDS, the most common indications for allo-HSCT in adults. To capture the transcriptional heterogeneity of malignant myeloid cells, we applied a single-cell-based classifier to define genes expressed by leukemic cells in the Beat AML cohort, and then removed those expressed in non-hematopoietic tissues, per the GTEx RNA and protein database. Similarly, a broader set of pan-hematopoietic genes was defined starting from the expression profiles of 18 mature hematopoietic lineages and hematopoietic stem and precursor cells. Through this process, we identified 259 genes with preferential expression in AML and 615 broadly expressed in the hematopoietic compartment. We applied this analytical pipeline to WES from 220 HLA-matched D-R pairs transplanted at DFCI. To discover mHAgs with optimal GvL potential, we focused on the subgroup of 45 patients with long-term survival in the absence of both relapse and GvHD requiring systemic treatment (i.e., GvHD-free/relapse-free survival [GRFS]).
For these patients, who experienced a purely GvL effect, we analyzed the individual repertoire of predicted GvL mHAgs and found 86 SNPs recurring in ≥5 GRFS patients. These GvL SNPs were enriched in the subgroup of GRFS patients versus those without GRFS, suggesting that their overrepresentation was not dictated by high allelic frequency in the overall cohort. Using the median number of recurring GvL mHAgs in the GRFS subgroup (‘GRFS mHAgs’) as a cutoff to stratify the entire study cohort, we found that a higher GRFS mHAg load was associated with a protective effect against relapse (p=0.009) and conferred a benefit in 2-year overall survival (p=0.03). Furthermore, HLA class I immunopeptidome analysis of 5 AML cell lines revealed evidence of presentation for 10 epitopes predicted from genes in the overall GvL filter, two of which were also part of the GRFS mHAg set. Given the limited size and lack of ethnic diversity of our discovery cohort (composed mainly of patients of European ancestry), we used genomic data from the 1000 Genomes Project to estimate the feasibility of targeting our GRFS mHAg set in a broader population via in silico modeling of allo-HSCT. We simulated 2270 D-R pairs from 844 individuals across 5 populations of disparate ancestry (AFR, EAS, EUR, SAS, and AMR), mirroring a typical URD search in the NMDP registry. To gauge the feasibility of generating personalized mHAg-specific immunotherapy (i.e., cancer vaccines, adoptive cellular therapy) based on the GRFS mHAg set, we estimated that a 10-SNP design (i.e., the number of GRFS SNPs used for correlative analyses in our discovery cohort) would be possible for ~90% of the 1000 Genomes simulated cohort (EUR: 99%, EAS: 72%, SAS: 91%, AFR: 69%, AMR: 74%). Through a similar process, we calculated population coverage simulating a T cell-based approach targeting ≥1 epitope from the GRFS mHAg set. First, we identified 18 HLA alleles that could ensure ~99% coverage across diverse ethnic backgrounds.
Second, for each HLA allele, the 3 epitopes from the GRFS SNP set that were most frequently predicted and had the highest agretopicity were selected, defining a pool of 54 epitopes. We thus calculated that ≥1 epitope could potentially be targeted for 81% of the simulated D-R pairs (EUR: 91%, EAS: 78%, SAS: 71%, AFR: 62%, and AMR: 85%). Overall, we developed a robust approach to identify novel GvL mHAgs, which could form the basis for systematic personalized immunotherapy following allo-HSCT.
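The recurrence filter and median-cutoff stratification described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the dictionary-of-sets input layout, and the choice to count a load equal to the median as "high" are all assumptions made for illustration, since the abstract does not specify them.

```python
from collections import Counter
from statistics import median


def recurring_snps(patient_snps, min_patients=5):
    """Return the SNPs predicted in at least `min_patients` patients.

    `patient_snps` maps a patient ID to the set of predicted GvL mHAg SNPs
    for that donor-recipient pair (hypothetical input layout).
    """
    counts = Counter(snp for snps in patient_snps.values() for snp in set(snps))
    return {snp for snp, n in counts.items() if n >= min_patients}


def stratify_by_load(cohort_snps, grfs_ids, min_patients=5):
    """Stratify a cohort by its load of SNPs recurring in the GRFS subgroup.

    Assumption: a load greater than or equal to the GRFS-subgroup median is
    labelled "high"; the source does not state how ties are handled.
    """
    grfs = {pid: cohort_snps[pid] for pid in grfs_ids}
    target = recurring_snps(grfs, min_patients)
    # Per-patient load = number of recurring GRFS SNPs carried by that pair.
    load = {pid: len(target & set(snps)) for pid, snps in cohort_snps.items()}
    cutoff = median(load[pid] for pid in grfs_ids)
    groups = {pid: ("high" if n >= cutoff else "low") for pid, n in load.items()}
    return target, cutoff, groups
```

The resulting "high" and "low" groups would then be compared for relapse and overall survival with standard time-to-event methods, which are outside the scope of this sketch.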