Abstract

Adult stem-cells may serve as the cell-of-origin for cancer, yet their unbiased identification in single-cell RNA sequencing data is challenging due to the high dropout rate. In the case of breast, the existence of a bipotent stem-like state is also controversial. Here we apply a marker-free algorithm to scRNA-Seq data from the human mammary epithelium, revealing a high-potency cell-state enriched for an independent mammary stem-cell expression module. We validate this stem-like state in independent scRNA-Seq data. Our algorithm further predicts that the stem-like state is bipotent, a prediction we are able to validate using FACS-sorted bulk expression data. The bipotent stem-like state correlates with clinical outcome in basal breast cancer and is characterized by overexpression of YBX1 and ENO1, two modulators of basal breast cancer risk. This study illustrates the power of a marker-free computational framework to identify a novel bipotent stem-like state in the mammary epithelium.

Highlights

  • Adult stem-cells may serve as the cell-of-origin for cancer, yet their unbiased identification in single-cell RNA sequencing data is challenging due to the high dropout rate

  • Single-cell RNA-sequencing studies are revolutionizing our understanding of cellular development, helping us elucidate the hierarchical organization of cell types within complex tissues and how this organization may be altered in diseases like cancer[1,2,3,4,5,6,7,8,9,10,11,12,13,14]

  • We show that these outstanding challenges can be overcome with a marker-free systems biology approach, called LandSCENT (Landscape of Single Cell Entropy), which builds upon our SCENT framework[30] to assign each cell to a specific cell-type and to a specific potency/entropy state

Summary

Results

Rationale for a marker-free approach to identify stem-like cells. We reanalyzed scRNA-Seq data from a previous study that used the 10X Genomics Chromium assay to profile over 25,000 mammary epithelial cells from four nulliparous healthy women[34]. We verified that the median dropout rate per cell was over 90% for each of the four women, affecting some of the proposed stemness markers like ALDH1A1, ZEB1, and TCF4[34,35] (Supplementary Fig. 1A, B). We observed that only for one of the four women (denoted “Ind-4”) did the top principal component of variation correlate with expression of basal and luminal markers (Supplementary Fig. 2). Performing t-SNE[38] followed by density-based spatial clustering[39] revealed three main single-cell clusters (Fig. 2a, the “Methods” section), in line with previous observations[34], and consistent with known biology: one cluster expressed high levels of KRT14, a well-known basal marker, whereas the other two expressed KRT18, a well-known luminal marker (Fig. 2b).

[Figure: Using LandSCENT to identify high-potency cell-types in complex epithelial tissues. Panel text: Combine to identify distribution of potency states among cell types.]
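
The per-cell dropout rate quoted in the Results can be illustrated with a minimal sketch. It assumes that "dropout rate" means the fraction of genes with a zero count in a given cell, which is a common convention but is not defined in the text here. The function name `dropout_rate_per_cell` and the toy count matrix are hypothetical and are not taken from the study's data or code.

```python
from statistics import median


def dropout_rate_per_cell(counts):
    """Return the fraction of zero-count genes for each cell.

    counts: list of cells, each a list of per-gene read/UMI counts.
    Assumption: a "dropout" is any gene with a count of exactly zero.
    """
    rates = []
    for cell in counts:
        if not cell:
            raise ValueError("each cell needs at least one gene")
        zeros = sum(1 for c in cell if c == 0)
        rates.append(zeros / len(cell))
    return rates


# Hypothetical toy matrix: 3 cells x 10 genes (illustrative values only).
toy_counts = [
    [0, 0, 3, 0, 0, 0, 0, 0, 0, 1],  # 8 of 10 genes are zero
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 2],  # 9 of 10 genes are zero
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0],  # all 10 genes are zero
]

rates = dropout_rate_per_cell(toy_counts)
median_rate = median(rates)  # summary statistic across cells
```

On a real dataset the same calculation would be run separately for each donor, so that a median can be reported per individual as in the text.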
Discussion
Methods