Abstract

Background: Analysis of samples at the single-cell level offers insights into cellular heterogeneity and cell function. Cell type annotation is the first critical step in such an analysis. While current methods primarily use single-cell RNA sequencing (scRNA-seq) for annotation, several studies have demonstrated improved classification accuracy by combining scRNA-seq with transposase-accessible chromatin sequencing (ATAC-seq) using unsupervised methods. However, the utility of ATAC-seq features for supervised cell-type annotation has not been explored.

Aims/Objectives: The objective of this study was to evaluate the relative performance of supervised cell-type classification using scRNA-seq alone versus in multimodal combination with ATAC-seq, and to assess how these data interact with the choice of classification and dimensionality reduction methods.

Methods: A peripheral-blood mononuclear cell multi-omic dataset from a single, healthy female donor was analysed in this study. Ground truth annotations were generated using unsupervised annotation with the weighted nearest neighbour clustering method. Two dimensionality reduction methods (principal component analysis (PCA) and the single-cell Variational Inference (scVI) autoencoder) and four classification models (logistic regression, random forest, support vector machine (SVM)) were implemented, and performance metrics (F1 score, precision, and recall) were compared over 10 bootstrap samples.

Results: ATAC-seq features improved annotation quality and prediction confidence when using scVI embeddings, independent of the classifier. The best-performing model (SVM with scVI embeddings) showed an increase in median macro F1 score from 0.907 (IQR = [0.902, 0.910]) using scRNA-seq alone to 0.946 (IQR = [0.940, 0.949], p < 0.05) with ATAC-seq added. For PCA embeddings, improvements in macro F1 score were not statistically significant. All cell types (B, T, monocytes, natural killer and dendritic cells) showed significant improvements when using ATAC-seq with scVI embeddings. CD4 T effector memory cells showed the largest gain in F1 score (0.112, p < 0.01), whilst type-2 conventional dendritic cells showed the smallest improvement (0.006, p < 0.05). Prediction confidence was improved in B cells, monocytes, natural killer cells, CD4 and CD8 naïve cells, CD4 T central memory cells and CD8 T effector memory cells. Improvements in F1 scores were lost when classifying only major cell types rather than subtypes.

Conclusions: Employing ATAC-seq embeddings with the scVI autoencoder enhances supervised annotation quality over scRNA-seq-only methods. Further studies should explore the use of ATAC-seq to improve the annotation of highly heterogeneous tissues such as tumours.

Citation Format: Jaidip Gill, Abhijit Dasgupta, Brychan Manry, Natasha Markuzon. Combining single-cell ATAC and RNA sequencing for supervised cell annotation [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2024; Part 1 (Regular Abstracts); 2024 Apr 5-10; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2024;84(6_Suppl):Abstract nr 4927.
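Illustrative note: the abstract reports macro F1 as a median with an interquartile range over 10 bootstrap samples. The sketch below shows one way such a summary could be computed from true and predicted cell-type labels. It is not the authors' code. The function names (`per_class_scores`, `macro_f1`, `bootstrap_macro_f1`), the toy labels, and the choice to resample the evaluated cells with replacement are all assumptions made for illustration; the abstract does not state how the bootstrap samples were drawn.

```python
import random
import statistics
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label gains a false positive
            fn[truth] += 1  # true label gains a false negative
    scores = {}
    for label in labels:
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores."""
    scores = per_class_scores(y_true, y_pred)
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


def bootstrap_macro_f1(y_true, y_pred, n_boot=10, seed=0):
    """Median and IQR of macro F1 over n_boot resamples (assumed scheme)."""
    rng = random.Random(seed)
    n = len(y_true)
    values = []
    for _ in range(n_boot):
        # Resample cell indices with replacement.
        idx = [rng.randrange(n) for _ in range(n)]
        values.append(macro_f1([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    q1, _, q3 = statistics.quantiles(values, n=4)
    return statistics.median(values), (q1, q3)


# Toy labels, purely hypothetical.
y_true = ["B", "B", "T", "T", "NK", "NK"]
y_pred = ["B", "B", "T", "NK", "NK", "NK"]
median_f1, (q1, q3) = bootstrap_macro_f1(y_true, y_pred)
print(f"macro F1 = {macro_f1(y_true, y_pred):.3f}")
print(f"bootstrap median = {median_f1:.3f}, IQR = [{q1:.3f}, {q3:.3f}]")
```

Macro averaging weights every cell type equally, so a rare subtype such as a dendritic cell subset contributes as much to the score as an abundant one. This is one reason subtype-level gains can be visible in macro F1 even when overall accuracy changes little.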