Abstract

BackgroundA vital step in analyzing single-cell data is ascertaining which cell types are present in a dataset, and at what abundance. In many diseases, the proportions of varying cell types can have important implications for health and prognosis. Most approaches for cell type annotation have centered around cell typing for single-cell RNA-sequencing (scRNA-seq) and have had promising success. However, reliable methods are lacking for many other single-cell modalities such as single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq), which quantifies the extent to which genes of interest in each cell are epigenetically “open” for expression.ResultsTo leverage the informative potential of scATAC-seq data, we developed CAMML with the integration of chromatin accessibility (CAraCAl), a bioinformatic method that performs cell typing on scATAC-seq data. CAraCAl performs cell typing by scoring each cell for its enrichment of cell type-specific gene sets. These gene sets are composed of the most upregulated or downregulated genes present in each cell type according to projected gene activity.ConclusionsWe found that CAraCAl does not improve performance beyond CAMML when scRNA-seq is present, but if only scATAC-seq is available, CAraCAl performs cell typing relatively successfully. As such, we also discuss best practices for cell typing and the strengths and weaknesses of various cell annotation options.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.