Abstract

ATAC-seq has become a leading technology for probing the chromatin landscape of single and aggregated cells. Distilling functional regions from ATAC-seq presents diverse analysis challenges. Methods commonly used to analyze chromatin accessibility datasets are adapted from algorithms designed to process different experimental technologies, disregarding the statistical and biological differences intrinsic to the ATAC-seq technology. Here, we present ChromA, a Bayesian statistical approach that uses latent space models to better model accessible regions. ChromA annotates the chromatin landscape by integrating information from replicates, producing a consensus de-noised annotation of chromatin accessibility. ChromA can analyze single-cell ATAC-seq data, correcting many biases generated by the sparse sampling inherent in single-cell technologies. We validate ChromA on multiple technologies and biological systems, including mouse and human immune cells, establishing ChromA as a top-performing general platform for mapping the chromatin landscape in different cellular populations from diverse experimental designs.

Highlights

  • ATAC-seq has become a leading technology for probing the chromatin landscape of single and aggregated cells

  • We show that the method is readily adaptable to different experimental designs and technologies

  • To improve upon the duration behavior of standard hidden Markov models (HMM)[19], we model the duration (d) of each accessible region through a hidden semi-Markov model (HSMM) that exhibits a flexible negative binomial (NB) duration distribution[20]
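
The contrast between HMM and HSMM duration behavior can be illustrated with a minimal sketch. It is not ChromA's implementation: the function names, the shifted parameterization (d - 1 follows a negative binomial with shape r and success probability p), and the parameter values are assumptions chosen for illustration. In a standard HMM, the time spent in a state is geometric, so the most likely duration is always a single position; a negative binomial duration with r > 1 can instead peak at a longer region length, and r = 1 recovers the geometric case.

```python
import math


def geometric_duration_pmf(d, stay_prob):
    """P(duration = d) implied by a standard HMM state whose
    self-transition probability is stay_prob (d = 1, 2, ...)."""
    if d < 1:
        return 0.0
    return (stay_prob ** (d - 1)) * (1.0 - stay_prob)


def nb_duration_pmf(d, r, p):
    """P(duration = d) under a shifted negative binomial (assumed
    parameterization): d - 1 ~ NB(r, p), with shape r > 0 and success
    probability 0 < p < 1. Computed in log space for stability."""
    if d < 1:
        return 0.0
    k = d - 1
    log_coeff = math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
    return math.exp(log_coeff + r * math.log(p) + k * math.log1p(-p))


def most_likely_duration(pmf, max_d=2000):
    """Return the duration in 1..max_d with the highest probability."""
    return max(range(1, max_d + 1), key=pmf)


if __name__ == "__main__":
    # Illustrative parameter values only; not taken from the paper.
    hmm_mode = most_likely_duration(lambda d: geometric_duration_pmf(d, 0.9))
    hsmm_mode = most_likely_duration(lambda d: nb_duration_pmf(d, 5.0, 0.1))
    print("most likely duration, geometric (HMM):", hmm_mode)
    print("most likely duration, negative binomial (HSMM):", hsmm_mode)
```

The extra shape parameter is what gives the duration distribution its flexibility: the model can favor regions of a characteristic width rather than always preferring the shortest possible one.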

Summary

Introduction

ATAC-seq has become a leading technology for probing the chromatin landscape of single and aggregated cells. Low starting material techniques to probe the methylome landscape and different chromatin features have evolved from bulk assays to the single-cell domain[11,12,13]. These techniques raise the possibility of describing the variability of chromatin accessibility, methylation states and chromatin fragments, and they enable the study of epigenomic heterogeneity by classifying cellular types based on their chromatin structure[13,14,15]. We use Th17 bulk[17,18], A20 and GM12878 single-cell data sets (see the Data availability section), identifying accessible chromatin and establishing ChromA as an effective platform for mapping the chromatin landscape in different cellular populations. We show that the method is readily adaptable to different experimental designs and technologies.
