I.I., A.G.X.Z., Q.G., L.G.P.: co-first authors; J.E.D., C.G.M.: co-senior authors
Introduction: Genomic analyses of bulk ALL samples have improved our understanding of the genetic basis and risk stratification of B-ALL, but do not directly examine intratumor heterogeneity or enable inference of leukemia developmental state and cell of origin. Methods: We profiled 89 B-ALL samples by single-cell RNA-seq (scRNA-seq) (10X Genomics 5' v2) and compared them to an scRNA-seq reference map of normal human B-cell development. Results: Analysis of inferred DNA copy number alterations at the single-cell level showed that near-haploid and hyperdiploid aneuploid ALL harbor chromosomal losses or gains in all blasts, consistent with an early, synchronous origin rather than sequential accumulation. A subset of TCF3::PBX1 samples harbored clones with progressive overexpression of genes surrounding PBX1, which single-cell whole-genome sequencing (scWGS) confirmed was caused by accumulation of DNA gains. Extensive intra-sample heterogeneity was observed by non-negative matrix factorization and was driven by five consensus gene expression signatures: two cell cycle signatures (S and G2/M), metabolism, differentiation, and inflammation. Signature composition refined leukemic subtyping; for example, within DUX4-r B-ALL two groups were discerned: one with a higher inflammation score, expression of stem cell genes, and worse outcome, and one with a lower inflammation score and expression of Pro-B-related genes. To understand variation in developmental states among B-ALL blasts, we first developed an atlas of human B cell development. Marker genes identified from bulk RNA-seq on purified human HSC, MPP, LMPP, MLP, CLP, Pre-Pro-B, Pro-B, Pre-B, and B populations guided the development of a new scRNA-seq atlas of human B cell development comprising 130,085 cells spanning 90 fetal, pediatric, and adult samples from 8 studies (Fig. 1A).
Unexpectedly, we found that human CLPs retain transcriptional programs governed by CEBPA; in vitro and in vivo functional assays validated the capacity of CLPs for extensive myeloid differentiation prior to lineage restriction at the Pro-B stage. We next mapped our 89 B-ALL scRNA-seq samples to precise cellular states along B cell development and clustered samples by leukemia cell composition. As expected, most samples had gene expression profiles (GEPs) similar to those of Pro-B cells, and this was most pronounced among hyperdiploid samples. High Pre-B abundance was observed in MEF2D-r and TCF3::PBX1, while an HSC/MPP/LMPP-enriched cluster encompassed ZNF384-r and a subset of DUX4-r B-ALL. Notably, one ZNF384-r B-ALL patient with LMPP involvement at diagnosis retained these LMPP-like cells at relapse despite a lineage switch to AML. Finally, we identified a group of patients with high early lymphoid (MLP, CLP, Pre-Pro-B) abundance representing a subset of Ph+ and KMT2A-r samples (Fig. 1B). We next used marker genes from each B-ALL cell state to estimate the relative abundance of each state within a bulk RNA-seq cohort of 2046 B-ALL patients. In addition to validating these associations between genomic class and leukemia cell state, we identified age-dependent patterns of B-ALL state involvement wherein early lymphoid involvement is highest in infancy and adulthood while Pro-B involvement is highest in childhood. In childhood, high-risk disease is associated with higher early lymphoid (P=4.9e-12) and lower Pro-B (P=0.0085) abundance. Further, patients with residual disease (RD) levels above 1% have higher upfront early lymphoid abundance (P=1e-7) and lower Pro-B (P=4.1e-5) and Pre-B (P=2.2e-11) abundance than patients with RD < 0.01%. Finally, B-ALL cell state improved resolution of previously reported transcriptional heterogeneity within defined genomic classes.
Early lymphoid abundance discriminates between two subtypes within Ph+ (Kim Nat Gen 2023) with AUC=0.953 (P=2.4e-9), as well as two subtypes within KMT2A-r (Brady Nat Gen 2022) with AUC=0.996 (P=1.1e-6). Notably, we find that KMT2A::AFF1 fusions produce early lymphoid-enriched disease compared to KMT2A::MLLT3 (P=0.0024) and KMT2A::MLLT10 (P=0.0037), wherein the latter two fusions exhibit Pre-B enrichment (P=0.025 and P=0.0037, respectively). Conclusions: Understanding variation in transcriptional programs and developmental states of B-ALL blasts by single-cell transcriptomics refines existing clinical and genomic classifications and provides novel prognostic markers.
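
For illustration only, the following minimal sketch shows one simple way a marker-gene abundance score could be derived from bulk expression data and then summarized as an AUC of the kind reported above. It is not the authors' pipeline: the abstract does not state which scoring or deconvolution method was used, and the functions `signature_score` and `auc`, the gene names, and the toy values are all hypothetical.

```python
from statistics import mean, pstdev


def signature_score(expr, marker_genes):
    """Average per-gene z-score across marker genes, one score per sample.

    expr: dict mapping gene name -> list of expression values (one per sample).
    This is an assumed, simplified scoring scheme, not the method of the abstract.
    """
    genes = [g for g in marker_genes if g in expr]
    if not genes:
        raise ValueError("none of the marker genes are present in expr")
    n_samples = len(expr[genes[0]])
    scores = [0.0] * n_samples
    for gene in genes:
        values = expr[gene]
        mu = mean(values)
        sd = pstdev(values) or 1.0  # avoid division by zero for flat genes
        for i, v in enumerate(values):
            scores[i] += (v - mu) / sd / len(genes)
    return scores


def auc(scores_pos, scores_neg):
    """Probability that a random positive outscores a random negative.

    Ties count as 0.5; this equals the Mann-Whitney U statistic divided by
    (number of positives * number of negatives).
    """
    wins = 0.0
    for p in scores_pos:
        for q in scores_neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Toy data: placeholder genes and six made-up samples (not real measurements).
expr = {
    "GENE_A": [5, 6, 7, 1, 2, 1],
    "GENE_B": [4, 6, 5, 2, 1, 2],
    "GENE_C": [3, 3, 3, 3, 3, 3],  # not a marker; ignored by the score
}
labels = [1, 1, 1, 0, 0, 0]  # 1 = hypothetical subtype of interest

scores = signature_score(expr, ["GENE_A", "GENE_B"])
pos = [s for s, y in zip(scores, labels) if y == 1]
neg = [s for s, y in zip(scores, labels) if y == 0]
toy_auc = auc(pos, neg)
```

An AUC near 1 means the score ranks almost every sample of one subtype above every sample of the other, while 0.5 means no separation. A real analysis would use normalized expression, validated marker sets, and an established statistics library.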