Abstract

Dear Editor,

The origin and phenotypic heterogeneity of cancer-associated fibroblasts (CAFs) have been described by various models but are not completely understood [1-3]. We used six publicly available single-cell RNA sequencing (scRNA-seq) datasets of five cancer types (except breast cancer) on CAFs and corresponding normal fibroblasts (NFs) (Figure 1A) [4-6] and established a comprehensive model for CAF development and gene expression dynamics over time. The fibroblast fraction constituted less than 10% of all cellular components in each dataset (Figure 1B). This relatively low fraction may be ascribed to our strict two-step procedure for defining fibroblasts. Based on the global gene expression patterns, breast cancer CAFs were markedly different from CAFs from other organs (Figure 1C). K-means clustering, with the optimal number of clusters (k) calculated for each sample using the sum of squared errors, and subsequent principal component analyses revealed the presence of several CAF and NF clusters (Figure S1).

Based on the recent discovery of PRRX1 as a critical regulator of the fibroblast-specific key transcriptional network [7], we examined PRRX1 expression in each cluster. None of the NFs exhibited PRRX1 activity, whereas certain CAF clusters showed significantly high PRRX1 expression (Figure 1E). Furthermore, the known CAF-related genes in various functional categories were upregulated only in the CAF clusters with high PRRX1 activity (Figure 1F). Thus, we labeled these CAFs as “perpetually activated CAFs” (paCAFs), which constituted approximately 50%–80% of all CAFs in each dataset (Figure 1D).

Bone marrow-derived mesenchymal stem cells (BM-MSCs) or local tissue-resident (tr)-fibroblasts have been suggested as the primary source of CAFs; therefore, we examined BM-MSC markers [2, 8]. CAFs generally expressed higher levels of BM-MSC markers than NFs (Figure S2). Subgroup analysis revealed that the CAF clusters with higher BM-MSC marker expression comprised only paCAFs (Figures 2A and 2B).
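
The cluster-number selection mentioned above (k-means with k chosen from the sum of squared errors) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the elbow criterion (largest second difference of the SSE curve), the farthest-point seeding, the function names, and the toy two-dimensional points are all assumptions introduced here, since the letter does not state which criterion or software was used.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def farthest_point_init(points, k):
    """Deterministic seeding (an assumption of this sketch): start at the
    first point, then repeatedly add the point farthest from all chosen seeds."""
    centroids = [list(points[0])]
    while len(centroids) < k:
        nxt = max(points, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(nxt))
    return centroids


def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]


def kmeans_sse(points, k, n_iter=100):
    """Lloyd's algorithm; returns (labels, sum of squared errors)."""
    centroids = farthest_point_init(points, k)
    labels = assign(points, centroids)
    for _ in range(n_iter):
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        new_labels = assign(points, centroids)
        if new_labels == labels:  # converged
            break
        labels = new_labels
    sse = sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))
    return labels, sse


def choose_k_by_elbow(points, k_values):
    """Pick the k with the largest second difference of SSE.
    k_values must be consecutive integers; the end points cannot be chosen."""
    ks = list(k_values)
    sse = [kmeans_sse(points, k)[1] for k in ks]
    curvature = {ks[i]: sse[i - 1] - 2 * sse[i] + sse[i + 1]
                 for i in range(1, len(ks) - 1)}
    return max(curvature, key=curvature.get), dict(zip(ks, sse))


# Hypothetical toy data: three tight groups of five points each.
offsets = [(-0.2, -0.1), (0.1, 0.2), (0.2, -0.2), (-0.1, 0.1), (0.0, 0.0)]
centers = [(0.0, 0.0), (10.0, 0.0), (5.0, 9.0)]
toy_points = [(cx + dx, cy + dy) for cx, cy in centers for dx, dy in offsets]

best_k, sse_by_k = choose_k_by_elbow(toy_points, range(1, 6))
```

On the toy points the SSE drops sharply up to three clusters and flattens afterwards, so the sketch selects k = 3. Real expression data would normally be reduced in dimension first, and other criteria (for example silhouette scores) could be substituted.
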
Meanwhile, since one NF cluster from each set also showed high BM-MSC marker expression, we named these “tr-MSC-like fibroblasts” (tr-MSCFs) (Figures 2C and 2D). We confirmed that both tr-MSCFs and paCAFs lacked expression of hematopoietic stem cell markers (Figure S3), implying that they were derived from BM-MSCs. Owing to the relatively low transcriptional activities in the remaining NFs (Figure S4), they were named tr-resting fibroblasts (tr-RFs). These cells were considered terminally differentiated mature tissue fibroblasts with no phenotypic plasticity. However, they may have originated from tr-MSCFs, at least partly.

In the paCAF group investigation, one paCAF cluster showed a myofibroblastic (my)CAF signature (Figures 2E and 2F), whereas the other showed an inflammatory (i)CAF signature [9]. Further subgroup analysis using unsupervised k-means clustering on colon and lung cancer dataset paCAFs revealed two subclusters with a myCAF or iCAF signature (Figures 2G-2I and S5A).

Next, to trace the origin of paCAFs, we performed all possible pairwise correlation analyses. In colon and lung cancers, paCAFs correlated exclusively with tr-MSCFs, whereas tr-RFs were associated with non-paCAFs (Figures 3A and S6A). Trajectory inference analyses indicated the progressive differentiation of tr-MSCFs into paCAFs (Figure 3B). This suggested that paCAFs can originate from tr-MSCFs in the adjacent normal tissue, although the split trajectories in these datasets also suggested that some paCAFs may have originated from tr-RFs or non-paCAFs. Analyses of the transition between the two paCAF subclusters revealed a significantly longer pseudotime in myCAFs than in iCAFs, indicating serial gene expression transition (Figure 3C). RNA velocity analysis also supported the tr-MSCF-iCAF-myCAF axis (Figure 3D). These findings were reproduced in the independent “colon set 2” (Figure 3E).
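
The pairwise correlation step described above can be illustrated with a small sketch that correlates mean expression profiles between every pair of fibroblast groups and reports the best-matching partner of each. The use of Pearson correlation on group means, the function names, and the numeric profiles are assumptions made for illustration only; the values are invented and are not taken from the datasets analyzed in the letter.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def pairwise_correlations(profiles):
    """Correlation for every unordered pair of groups."""
    return {(a, b): pearson(profiles[a], profiles[b])
            for a, b in combinations(sorted(profiles), 2)}


def best_partner(profiles, name):
    """Group whose profile correlates most strongly with `name`."""
    others = [other for other in profiles if other != name]
    return max(others, key=lambda other: pearson(profiles[name], profiles[other]))


# Hypothetical mean expression of five genes per group (made-up numbers).
toy_profiles = {
    "tr-MSCF":   [5.0, 4.0, 1.0, 0.5, 3.0],
    "paCAF":     [6.0, 5.0, 1.5, 0.4, 3.5],
    "tr-RF":     [1.0, 0.5, 4.0, 5.0, 1.0],
    "non-paCAF": [1.2, 0.8, 3.5, 4.5, 1.4],
}

all_pairs = pairwise_correlations(toy_profiles)
partner_of_pacaf = best_partner(toy_profiles, "paCAF")
partner_of_trrf = best_partner(toy_profiles, "tr-RF")
```

With these invented profiles, paCAF pairs with tr-MSCF and tr-RF pairs with non-paCAF, mirroring the pattern of the reported result without reproducing it.
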
In lung cancer, myCAFs and iCAFs showed narrow pseudotime zones, and the transition from iCAFs to myCAFs was unremarkable, probably owing to the low number of fibroblasts (Figure S6B). The tr-MSCFs were also exclusively correlated with paCAFs in the stomach and ovaries (Figure 3F). In addition, trajectory analysis revealed that tr-MSCFs, iCAFs, and myCAFs were on a continuous progression with increasing pseudotime in stomach cancer (Figures 3G-3I), indicating that tr-MSCFs are first transformed into iCAFs and then into myCAFs (Figure 3J). In ovarian cancer, two distinct paths of tr-MSCF progression, related to the peritoneum and omentum, were observed (Figure 3K). The iCAFs in ovarian cancer were located halfway along the trajectory from the tr-MSCFs to myCAFs, and the tr-MSCFs and iCAFs were mixed at the front end of the continuous line of increasing pseudotime (Figures 3L and 3M), suggesting two different routes of development: (1) tr-MSCFs to myCAFs via iCAFs, or (2) direct transition of tr-MSCFs to myCAFs (Figure 3N). We validated the trajectory analyses of CAFs using Slingshot (Figure S6C).

The gene expression profiles of tr-MSCFs were more similar to those of iCAFs than to those of myCAFs (Figures 3O and S7A). In addition, using publicly available BM-MSC data (GSE147287) [10], we found that BM-MSCs, like tr-MSCFs, showed higher expression of iCAF-related genes (Figure 3P). Correlation analysis between BM-MSCs and the fibroblasts (colon set 1) showed that a subset of BM-MSCs correlated best with tr-MSCFs (Figure 3Q); trajectory analysis (Figure 3R) after batch correction (Figure S7B) indicated this subset as a source of tr-MSCFs as well as paCAFs. These results suggested the developmental paths of CAFs, starting from BM-MSCs and ending at myCAFs (Figure 3S). Additionally, we extracted a “genuine CAF signature” based on our model in colon cancer, which was associated with poorer prognosis independent of the stage (Figure S8 and Table S1).
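
The ordering of cell states along pseudotime reported above (tr-MSCFs, then iCAFs, then myCAFs) can be summarized with a simple per-group statistic once a trajectory tool has assigned a pseudotime to each cell. The sketch below is only an illustration under stated assumptions: ranking groups by median pseudotime, the function name, and the toy values are hypothetical and do not come from the letter or its supplementary methods.

```python
from statistics import median


def order_states_by_pseudotime(pseudotime, labels):
    """Group cells by label and rank the groups by median pseudotime.

    pseudotime: per-cell pseudotime values (from any trajectory method)
    labels:     per-cell group labels, same length as pseudotime
    Returns (ordered list of labels, dict of label -> median pseudotime).
    """
    groups = {}
    for value, label in zip(pseudotime, labels):
        groups.setdefault(label, []).append(value)
    medians = {label: median(values) for label, values in groups.items()}
    return sorted(medians, key=medians.get), medians


# Hypothetical toy cells: four per group, with overlapping pseudotime ranges.
toy_labels = ["tr-MSCF"] * 4 + ["iCAF"] * 4 + ["myCAF"] * 4
toy_pseudotime = [0.5, 1.0, 1.5, 2.5,
                  2.0, 3.0, 4.0, 5.0,
                  4.5, 6.0, 7.0, 8.0]

state_order, state_medians = order_states_by_pseudotime(toy_pseudotime, toy_labels)
```

A median is used here because it is robust to a few cells that sit far along the trajectory; a formal comparison between groups would additionally need a statistical test, which this sketch does not attempt.
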
Collectively, during cancer development, tr-MSCFs migrate to the cancer site and transform into paCAFs (first into iCAFs and then into myCAFs). BM-MSCs are simultaneously recruited to cancer sites in response to signaling cues from cancer cells and then transform into iCAFs, or subsequently into myCAFs. Meanwhile, as the neoplasm grows, the dominant cancer niche gradually engulfs the tr-RF territory. As tr-RFs are terminally differentiated mature fibroblasts, they do not transform into paCAFs at the cancer site and constitute a group of NF-like cells among CAFs (Figure 4).

Figure 4. Summary of the suggested developmental paths of perpetually activated cancer-associated fibroblasts (paCAFs). Abbreviations: BM-MSC, bone marrow-derived mesenchymal stem cells; iCAF, inflammatory CAF; myCAF, myofibroblastic CAF; tr-MSCF, tissue resident-mesenchymal stem cell-like fibroblasts; tr-RF, tissue resident-resting fibroblasts.

This study had several limitations. First, owing to the strict procedures for defining fibroblasts, we may have eliminated potential fibroblast populations. Second, as paCAFs were identified based on the expression of PRRX1 and other CAF-related markers, the results did not fully reflect the activation status of heterogeneous CAFs. However, we believe that our findings provide valuable insights for future studies.

We acknowledge the data provided by the Samsung Medical Center, VIB-KU Leuven Center for Cancer Biology, and Stanford University School of Medicine.

The authors declare that they have no competing interests.

This study was approved by the IRB at Asan Medical Center, Seoul.

Chang Ohk Sung, Seok-Hyung Kim, and Dakeun Lee conceived the project and provided leadership. Chang Ohk Sung and Dakeun Lee designed the study. Hee Chul Chung and Chang Ohk Sung analyzed the genomic data and interpreted the results.
Dakeun Lee, Eun Jeong Cho, Hyeonjin Lee, Won-Kyung Kim, Ji-Hye Oh, and Seok-Hyung Kim contributed to materials, analysis, and interpretation of the data. Dakeun Lee, Chang Ohk Sung, and Hee Chul Chung wrote the manuscript. All authors reviewed and approved the final manuscript.

This study was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science, ICT & Future Planning (NRF-2019R1A2C1084460 and NRF-2021R1A2C2005853), the Bio and Medical Technology Development Program of the NRFK (NRF-2019M3E5D4066900) of the Korean government, and grant 2021IP0013 from the Asan Institute for Life Sciences of Asan Medical Center, Korea.

Count matrices and metadata for pan-cancer scRNA-seq data obtained from 36 patients are available at http://blueprint.lambrechtslab.org/. Processed scRNA-seq data and metadata for 23 Korean patients with colorectal cancer are available in the NCBI Gene Expression Omnibus (GEO) database under the accession code GSE132465. Filtered stomach cancer scRNA-seq data are available at https://dna-discovery.stanford.edu/research/datasets/. Filtered BM-MSC scRNA-seq data are available in the GEO database under the accession code GSE147287.

Supporting Information: Supplementary Methods (PDF).

