scTab: Scaling cross-tissue single-cell annotation models

Felix Fischer, David S Fischer, Roman Mukhin, Andrey Isaev, Evan Biederstedt, Alexandra-Chloé Villani, Fabian J Theis

doi:10.1038/s41467-024-51059-5

Abstract

Identifying cellular identities is a key use case in single-cell transcriptomics. While machine learning has been leveraged to automate cell annotation predictions for some time, there has been little progress in scaling neural networks to large data sets and in constructing models that generalize well across diverse tissues. Here, we propose scTab, an automated cell type prediction model specific to tabular data, and train it using a novel data augmentation scheme across a large corpus of single-cell RNA-seq observations (22.2 million cells). In this context, we show that cross-tissue annotation requires nonlinear models and that the performance of scTab scales both in terms of training dataset size and model size. Additionally, we show that the proposed data augmentation schema improves model generalization. In summary, we introduce a de novo cell type prediction model for single-cell RNA-seq data that can be trained across a large-scale collection of curated datasets and demonstrate the benefits of using deep learning methods in this paradigm.

Journal: Nature Communications
Publication Date: Aug 4, 2024
Citations: 2
License type: CC BY 4.0
