Characterizing efficient feature selection for single-cell expression analysis.

Juok Cho, Bukyung Baik, Hai C T Nguyen, Daeui Park, Dougu Nam

doi:10.1093/bib/bbae317

Abstract

Unsupervised feature selection is a critical step for efficient and accurate analysis of single-cell RNA-seq data. Previous benchmarks used two different criteria to compare feature selection methods: (i) the proportion of ground-truth marker genes included in the selected features and (ii) the accuracy of cell clustering evaluated against ground-truth cell types. Here, we systematically compare the performance of 11 feature selection methods under both criteria. We first demonstrate the discordance between these criteria and suggest using the latter. We then compare the mean expression distributions of the genes selected by each method. We show that lowly expressed genes exhibit very high coefficients of variation and are mostly excluded by high-performance methods. In particular, high-deviation- and high-expression-based methods outperform the widely used methods in the Seurat package in cell clustering and data visualization. We further show that they also enable a clear separation of the same cell type from different tissues as well as accurate estimation of cell trajectories.
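
The claim that lowly expressed genes show very high coefficients of variation can be illustrated with a small simulation. The sketch below is not the authors' pipeline and does not reproduce any of the 11 benchmarked methods. It assumes pure Poisson sampling noise with no biological signal, and the gene means, cell count, and function names (`coefficient_of_variation`, `select_by_mean`) are hypothetical choices made for illustration. Under that assumption the expected CV is 1/sqrt(mean), so the lowest-mean genes have the largest CV even though they carry no information about cell type.

```python
import numpy as np


def coefficient_of_variation(counts):
    """Per-gene CV (std / mean) for a cells x genes count matrix."""
    mean = counts.mean(axis=0)
    std = counts.std(axis=0, ddof=1)
    # Genes with zero mean have an undefined CV; report NaN for them.
    return np.divide(std, mean, out=np.full(mean.shape, np.nan), where=mean > 0)


def select_by_mean(counts, n_top):
    """Indices of the n_top genes with the highest mean expression."""
    mean = counts.mean(axis=0)
    return np.argsort(mean)[::-1][:n_top]


# Hypothetical data: 5000 cells, 4 genes, Poisson noise only (an assumption).
rng = np.random.default_rng(0)
gene_means = np.array([0.05, 0.5, 5.0, 50.0])
counts = rng.poisson(lam=gene_means, size=(5000, gene_means.size))

cv = coefficient_of_variation(counts)
top = select_by_mean(counts, n_top=2)

for mu, c in zip(gene_means, cv):
    print(f"mean={mu:>5}: CV={c:.2f} (Poisson expectation {1 / np.sqrt(mu):.2f})")
print("genes kept by a simple high-expression filter:", sorted(top.tolist()))
```

In this toy setting, ranking genes by CV alone would favor the lowest-expressed genes, whereas the high-expression filter discards them, which matches the direction of the abstract's observation.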

Journal: Briefings in Bioinformatics
Publication Date: May 23, 2024
License type: CC BY 4.0
