Abstract
Background: Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to perform such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events and low library sizes. It is thus not clear whether functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way.
Results: To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk-RNA tools PROGENy and GO enrichment, which estimate pathway activities, and DoRothEA, which estimates transcription factor (TF) activities, and compare them against the tools SCENIC/AUCell and metaVIPER, which were designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data upon CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal comparable performance to the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. We also provide the benchmark data for further use by the community.
Conclusions: Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.
Highlights
Robustness of bulk-based transcription factor (TF) and pathway analysis tools against low gene coverage: single-cell RNA-seq profiling is hampered by low gene coverage due to drop-out events [23]
We aimed to explore how DoRothEA, PROGENy, and Gene Ontology (GO) gene sets combined with Gene Set Enrichment Analysis (GSEA) (GO-GSEA) can handle low gene coverage in general, independently of other technical artifacts and characteristics from scRNA-seq protocols
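One way to picture "low gene coverage in general" is to keep only a random subset of genes from a bulk profile and check how much of a gene set survives. The sketch below is illustrative only: the function names, the toy gene names, and the uniform random subsampling scheme are assumptions made for this example, not the procedure or code used in the study.

```python
import random


def subsample_genes(profile, n_genes, seed=0):
    """Keep a random subset of n_genes from a gene -> expression mapping.

    Hypothetical helper: mimics low gene coverage by dropping genes
    uniformly at random (an assumption, not the study's exact scheme).
    """
    if n_genes > len(profile):
        raise ValueError("n_genes exceeds the number of genes in the profile")
    rng = random.Random(seed)
    kept = rng.sample(sorted(profile), n_genes)
    return {gene: profile[gene] for gene in kept}


def gene_set_coverage(profile, gene_set):
    """Fraction of gene-set members that are still present in the profile."""
    if not gene_set:
        return 0.0
    return sum(gene in profile for gene in gene_set) / len(gene_set)


# Toy profile with made-up gene names and values (illustrative only).
bulk_profile = {f"GENE{i}": float(i) for i in range(1, 101)}
toy_gene_set = {"GENE1", "GENE2", "GENE3", "GENE4", "GENE5"}

for n in (100, 50, 10):
    reduced = subsample_genes(bulk_profile, n, seed=42)
    print(n, gene_set_coverage(reduced, toy_gene_set))
```

Running a functional analysis tool on each reduced profile and comparing its scores with those from the full profile would then show how its performance changes as coverage drops.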
Summary
Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. Single-cell RNA-seq (scRNA-seq) data has characteristics such as drop-out events and low library sizes. It is not clear whether functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way. Thanks to diverse high-throughput techniques, such as microarrays and RNA-seq, expression profiles can be collected relatively easily and are very common. To extract functional and mechanistic information from these profiles, many tools have been developed that can, for example, estimate the status of [...]. Functional analysis tools typically combine prior knowledge with a statistical method to gain functional and mechanistic insights from omics data. There is a growing number of statistical methods, spanning from simple linear models to advanced machine learning methods [8, 9].
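The idea that a functional analysis tool combines prior knowledge (a gene set) with a statistic can be sketched with two very simple scores. This is a minimal illustration under stated assumptions: the function names, gene labels, weights, and the choice of a weighted sum and a plain mean are hypothetical and are not the implementation of PROGENy, DoRothEA, or any other tool named here.

```python
def weighted_sum_score(expression, weights):
    """Sum of expression * weight over genes present in both mappings.

    `weights` plays the role of prior knowledge with signed gene weights
    (hypothetical values, for illustration only).
    """
    shared = expression.keys() & weights.keys()
    return sum(expression[gene] * weights[gene] for gene in shared)


def mean_set_score(expression, gene_set):
    """Mean expression of the gene-set members found in the profile."""
    members = [expression[gene] for gene in gene_set if gene in expression]
    return sum(members) / len(members) if members else 0.0


# Made-up expression changes and gene sets (illustrative only).
expression = {"A": 2.0, "B": -1.0, "C": 0.5}
signed_weights = {"A": 1.0, "B": -1.0, "D": 3.0}  # "D" is not measured
unsigned_set = {"A", "B"}

print(weighted_sum_score(expression, signed_weights))  # 3.0
print(mean_set_score(expression, unsigned_set))        # 0.5
```

The two scores disagree on the same profile because the prior knowledge differs (signed weights versus an unsigned membership list), which is one simple way to see why the choice of gene set can matter as much as the statistic.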