PipeComp, a general framework for the evaluation of computational pipelines, reveals performant single cell RNA-seq preprocessing tools

Pierre-Luc Germain, Anthony Sonrel, Mark D Robinson

doi:10.1186/s13059-020-02136-7

Open Access

Abstract

We present pipeComp (https://github.com/plger/pipeComp), a flexible R framework for pipeline comparison handling interactions between analysis steps and relying on multi-level evaluation metrics. We apply it to the benchmark of single-cell RNA-sequencing analysis pipelines using simulated and real datasets with known cell identities, covering common methods of filtering, doublet detection, normalization, feature selection, denoising, dimensionality reduction, and clustering. pipeComp can easily integrate any other step, tool, or evaluation metric, allowing extensible benchmarks and easy applications to other fields, as we demonstrate through a study of the impact of removal of unwanted variation on differential expression analysis.
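The idea of comparing pipelines across all combinations of alternative methods, with optional evaluation at each step, can be illustrated with a small Python sketch. This is not pipeComp's R interface: the function `run_pipeline_grid`, the step names, and the toy methods are all hypothetical, and the sketch naively recomputes upstream steps for every combination, which a real framework would likely avoid.

```python
from itertools import product


def run_pipeline_grid(data, steps, alternatives, evaluators=None):
    """Run every combination of step alternatives and collect per-step metrics.

    All names are hypothetical; this does not mirror pipeComp's R interface.
    steps:        ordered list of step names.
    alternatives: {step: {label: function(data) -> data}}.
    evaluators:   optional {step: function(data) -> metric}.
    """
    evaluators = evaluators or {}
    results = []
    # Iterating over a dict yields its keys, i.e. the method labels of a step.
    for combo in product(*(alternatives[step] for step in steps)):
        current = data
        metrics = {}
        for step, label in zip(steps, combo):
            current = alternatives[step][label](current)
            # Evaluate the intermediate output if this step has a benchmark.
            if step in evaluators:
                metrics[step] = evaluators[step](current)
        results.append(
            {"combo": dict(zip(steps, combo)), "metrics": metrics, "output": current}
        )
    return results


# Toy example: two steps with two alternatives each gives four pipelines.
steps = ["filtering", "normalization"]
alternatives = {
    "filtering": {
        "none": lambda xs: list(xs),
        "drop_large": lambda xs: [x for x in xs if x <= 50],
    },
    "normalization": {
        "identity": lambda xs: list(xs),
        "scale_max": lambda xs: [x / max(xs) for x in xs],
    },
}
evaluators = {"filtering": len, "normalization": max}

results = run_pipeline_grid([1.0, 2.0, 3.0, 100.0], steps, alternatives, evaluators)
for r in results:
    print(r["combo"], r["metrics"])
```

Because each result stores the metrics of every evaluated step, the effect of an upstream choice (here, filtering) can be read both at that step and in its interaction with downstream steps.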

Highlights

Single-cell RNA-sequencing and the set of attached analysis methods are evolving fast, with more than 560 software tools available to the community [1], roughly half of which are dedicated to tasks related to data processing such as clustering, ordering, dimension reduction, or normalization
Optional benchmark functions can be set for each step to provide standardized, multi-layered evaluation metrics
We investigated the impact of several filtering methods: four methods based on deviations to median absolute deviations (MADs) with increasing levels of stringency and two methods based on scater’s runPCA using all or selected covariates
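As an illustration of MAD-based filtering with varying stringency, the following Python sketch flags values that lie more than a chosen number of MADs from the median. It is a minimal sketch, not the filtering code used in the study: the function name `mad_outliers`, its parameters, the toy counts, and the use of the 1.4826 scaling constant (the usual consistency factor for normally distributed data) are assumptions.

```python
from statistics import median


def mad_outliers(values, n_mads=3.0, direction="both", scale=1.4826):
    """Flag values further than n_mads scaled MADs from the median.

    Hypothetical helper for illustration only.
    direction: "both", "lower", or "higher".
    scale:     assumed consistency constant applied to the raw MAD.
    """
    center = median(values)
    mad = scale * median([abs(v - center) for v in values])
    lower = center - n_mads * mad
    upper = center + n_mads * mad
    flags = []
    for v in values:
        too_low = direction in ("both", "lower") and v < lower
        too_high = direction in ("both", "higher") and v > upper
        flags.append(too_low or too_high)
    return flags


# Toy per-cell quality values; a smaller n_mads is a more stringent filter.
counts = [10, 11, 12, 11, 10, 12, 11, 15, 50]
lenient = mad_outliers(counts, n_mads=5.0)
stringent = mad_outliers(counts, n_mads=2.0)
print(sum(lenient), sum(stringent))
```

Lowering `n_mads` narrows the accepted interval around the median, so more cells are excluded; in practice such thresholds would be applied to several per-cell quality measures rather than a single vector.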

Summary

Introduction

Single-cell RNA-sequencing (scRNAseq) and the set of attached analysis methods are evolving fast, with more than 560 software tools available to the community [1], roughly half of which are dedicated to tasks related to data processing such as clustering, ordering, dimension reduction, or normalization. A number of good comparison and benchmark studies have already been performed on various steps related to scRNAseq processing and analysis and can guide the choice of methodology [3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21]. These recommendations need constant updating and often leave open many details of an analysis. It is critical to evaluate the single effect of a preprocessing method and its positive or negative interaction with all parts of a workflow.

Journal: Genome Biology (2020) 21:227
Publication Date: Sep 1, 2020
Citations: 75
License type: open-access
