Abstract

The Single-cell Pediatric Cancer Atlas (ScPCA) Portal (https://scpca.alexslemonade.org/), developed and maintained by the Childhood Cancer Data Lab, is an open-source data resource for single-cell and single-nuclei RNA sequencing data of pediatric tumors. Originally composed of data from 10 projects funded by Alex’s Lemonade Stand Foundation, the Portal currently contains summarized gene expression data for over 500 samples from a diverse set of over 50 types of cancers and is growing to include community-contributed datasets. In addition to gene expression data from single-cell and single-nuclei RNA sequencing, the Portal holds data obtained from bulk RNA sequencing, spatial transcriptomics, and feature barcoding methods, such as CITE-seq and cell hashing. ScPCA data are available for download in formats ready for downstream analysis, such as SingleCellExperiment or AnnData objects. Objects include raw counts and normalized gene expression data, PCA and UMAP coordinates, unsupervised clustering assignments, and cell type annotations. Additionally, all downloads include two summary reports for each library. The quality control report is a general overview, including processing information, summary statistics, and general visualizations of cell metrics. The cell type annotation report includes an overview of cell type annotation, comparisons among cell type annotation methods, and diagnostic plots to assess annotation quality. Comprehensive documentation about data processing and the contents of files on the Portal, including a guide to getting started with an ScPCA dataset, can be found at scpca.readthedocs.io. All data on the Portal were uniformly processed using scpca-nf, a Nextflow-based open-source pipeline developed by the Childhood Cancer Data Lab.
The scpca-nf workflow uses alevin-fry for fast and efficient processing of all data currently available on the Portal, including single-cell RNA-seq data and any associated CITE-seq or cell hashing data, spatial transcriptomics data, and bulk RNA sequencing data. The workflow and associated documentation are freely available at https://github.com/AlexsLemonade/scpca-nf, so researchers can use the same pipeline to process their own single-cell or single-nuclei datasets. Furthermore, any datasets processed with scpca-nf are eligible for inclusion on the ScPCA Portal. The continuous growth of the ScPCA Portal will help pediatric cancer researchers spend less time finding and processing data and more time answering their pressing research questions.

Citation Format: Allegra G. Hawkins, Joshua A. Shapiro, Stephanie J. Spielman, David S. Mejia, Deepashree Venkatesh Prasad, Nozomi Ichihara, Arkadii Yakovets, Kurt G. Wheeler, Chanté J. Bethell, Steven M. Foltz, Jennifer O'Malley, Casey S. Greene, Jaclyn N. Taroni. The Single-cell Pediatric Cancer Atlas: Open-source data and tools for single-cell transcriptomics of pediatric tumors [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 2859.