Transformer for one stop interpretable cell type annotation

Jiawei Chen, Wanyu Tao, Yuxuan Zhao, Hao Xu, Zhaoxiong Chen, Jing-Dong J Han

doi:10.1038/s41467-023-35923-4

Journal: Nature Communications
Publication Date: Jan 14, 2023
Citations: 38
License type: open-access

Affiliation: Center for Life Sciences, Peking University

Abstract

Consistent annotation transfer from a reference dataset to a query dataset is fundamental to the development and reproducibility of single-cell research. Compared with traditional annotation methods, deep-learning-based methods are faster and more automated. A series of useful single-cell analysis tools based on the autoencoder architecture have been developed, but these struggle to strike a balance between depth and interpretability. Here, we present TOSICA, a multi-head self-attention deep learning model based on the Transformer that enables interpretable cell type annotation using biologically understandable entities, such as pathways or regulons. We show that TOSICA achieves fast and accurate one-stop annotation and batch-insensitive integration while providing biologically interpretable insights for understanding cellular behavior during development and disease progression. We demonstrate TOSICA’s advantages by applying it to scRNA-seq data of tumor-infiltrating immune cells and of CD14+ monocytes in COVID-19 to reveal rare cell types, heterogeneity and dynamic trajectories associated with disease progression and severity.

Full Text