Single cell genomics has revolutionised our understanding of differentiating systems and led to a reinterpretation of haematopoiesis, moving from a step-wise hierarchical tree model towards more continuous differentiation landscapes. Recent single cell RNA sequencing (scRNA-seq) studies of Lineage- c-kit+ (LK) mouse bone marrow progenitors have defined a highly granular landscape of differentiation from multipotent stem cells to committed progenitors in 8 different lineages and have served as a valuable reference landscape for comparisons with perturbation states. While these landscapes have permitted exploration and interrogation of early haematopoietic progenitors, no map of mouse total bone marrow haematopoiesis yet exists at single cell resolution. This limits efforts to fully understand the gene programs which specify maturing haematopoietic lineages, the cell surface phenotypes which identify them, and the connection between early bifurcation decisions in progenitor cells and mature cellular outputs. To address these gaps, we performed droplet-based scRNA-seq and cell surface proteomics on whole bone marrow from 4 male and 4 female 12-week-old C57Bl6 mice, using a panel of 138 antibodies against cell surface antigens. To increase representation of rare progenitor populations, we complemented total bone marrow mononuclear cells with c-kit-enriched cells and FACS-sorted LK and LK Sca1-positive (LSK) populations. After quality control filtering, the transcriptomes and proteomes of over 198,000 single cells were analysed separately before multimodal integration into a single UMAP embedding. Adding cell surface proteome information to the scRNA-seq data through multimodal integration produced a representation with greater resolution of haematopoietic cell types, including improved resolution of T-cell subtypes and separation of a rare population of eosinophil progenitors which were previously indistinguishable from basophil progenitors.
This integrated landscape of mouse haematopoiesis was then clustered and annotated through an iterative process of label transfer and manual annotation into 41 cell type states (Figure 1). Using cell fate analysis, which utilises transcriptional similarity and pseudotemporal ordering to estimate the probability of a single cell's commitment to defined terminal states, we were able to resolve cell fate probabilities towards nine haematopoietic lineages (Figure 2): erythroid, neutrophil, megakaryocyte, lymphoid, monocyte, dendritic cell (DC), plasmacytoid dendritic cell (pDC), basophil and eosinophil. From these probabilities we defined nine trajectories from the earliest uncommitted progenitors to terminally differentiated populations, including a pDC trajectory involving both lymphoid and common DC progenitors. Analysis of trajectory-based gene expression trends provides a framework for discovery of lineage drivers as well as gene expression programs shared across multiple lineages. Leveraging cell surface proteome data permits the use of ‘in-silico-FACS’ to describe the phenotypes of cell populations which exist at cell fate branch-points, for further in vitro characterisation. To demonstrate the utility of our atlas as a reference, we projected perturbation datasets from 7 pre-leukaemic mouse models onto the atlas, allowing quantitation of mutation-associated cellular and tissue-scale alterations. Finally, we have leveraged the transcription start site enrichment of 5' scRNA-seq to compute enhancer RNA (eRNA) expression in our atlas. This allows the identification of dynamic usage of specific enhancer elements during haematopoietic specification, and thus provides a novel framework for correlating eRNA expression (as a proxy for active distal chromatin regions) with gene expression trends along specific haematopoietic lineages.