Abstract

Enhancers are cis-acting sequences that regulate transcription rates of their target genes in a cell-specific manner and harbor disease-associated sequence variants in cognate cell types. Many complex diseases are associated with enhancer malfunction, necessitating the discovery and study of enhancers from clinical samples. Assay for Transposase Accessible Chromatin (ATAC-seq) technology can interrogate chromatin accessibility from small cell numbers and facilitate studying enhancers in pathologies. However, on average, ~35% of open chromatin regions (OCRs) from ATAC-seq samples map to enhancers. We developed a neural network-based model, Predicting Enhancers from ATAC-Seq data (PEAS), to effectively infer enhancers from clinical ATAC-seq samples by extracting ATAC-seq data features and integrating these with sequence-related features (e.g., GC ratio). PEAS recapitulated ChromHMM-defined enhancers in CD14+ monocytes, CD4+ T cells, GM12878, peripheral blood mononuclear cells, and pancreatic islets. PEAS models trained on these five cell types effectively predicted enhancers in four cell types that were not used in model training (EndoC-βH1, naïve CD8+ T, MCF7, and K562 cells). Finally, PEAS inferred individual-specific enhancers from 19 islet ATAC-seq samples and revealed variability in enhancer activity across individuals, including variability driven by genetic differences. PEAS is an easy-to-use tool developed to study enhancers in pathologies by taking advantage of the increasing number of clinical epigenomes.

Highlights

  • Enhancers are non-coding cis-regulatory elements that precisely regulate expression patterns of genes controlling cell type-specific functions and developmental fate[1]

  • Among the tools developed by the ENCODE consortium[5], the Hidden Markov Model (HMM)–based ChromHMM algorithm[7] has become an important tool to assess the global epigenomic landscape in human cells by segmenting genome-wide chromatin into a finite number of chromatin states based on combinatorial histone modification marks profiled by ChIP-seq technology

  • We showed that, by integrating data across these five cell types, we can predict enhancers from the ATAC-seq profile of cell types that are not used in model training (EndoC-βH1 beta cell line, naïve CD8+ T cells, and ENCODE cell lines MCF7 and K562), suggesting that Predicting Enhancers from ATAC-Seq data (PEAS) can predict enhancers in cell types that are not profiled by Roadmap/ENCODE consortia


Summary

Introduction

Enhancers are non-coding cis-regulatory elements that precisely regulate expression patterns of genes controlling cell type-specific functions and developmental fate[1]. Assay for Transposase Accessible Chromatin (ATAC-seq) technology[28,29] revolutionized epigenomic profiling by enabling chromatin accessibility profiling from small cell numbers. This technology has been widely used to study epigenomes of clinically relevant human cells under diverse conditions[30,31], including our work to study immunosenescence in blood-derived immune cells[32] and type 2 diabetes (T2D) in pancreatic islets[33]. There is a need to develop computational methods to discriminate OCRs mapping to enhancers from the remaining cis-regulatory elements. For this purpose, we developed a machine-learning framework based on neural networks (PEAS: Predicting Enhancers from ATAC-Seq data) to infer enhancers from ATAC-seq profiles (Fig. 1) by extracting and integrating ATAC-seq-related data features (e.g., peak length) with sequence-related features (e.g., GC%). PEAS was developed using scikit-learn[35] and is accompanied by a user interface developed in Java to enable other researchers to predict enhancers in their ATAC-seq samples (https://github.com/UcarLab/PEAS).
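
To make the feature-integration idea concrete, the following is a minimal sketch of how one ATAC-seq-related feature (peak length) and one sequence-related feature (GC fraction) might be computed for each open chromatin region. It is not the PEAS implementation: the `Peak` record, the function names, and the toy sequences are all hypothetical, and the full PEAS feature set is not described in this excerpt.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Peak:
    """One open chromatin region (hypothetical minimal record, not a PEAS type)."""
    chrom: str
    start: int      # 0-based, inclusive
    end: int        # 0-based, exclusive
    sequence: str   # genomic sequence under the peak


def gc_fraction(sequence: str) -> float:
    """Fraction of G/C bases among unambiguous A/C/G/T bases (N is ignored)."""
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def peak_features(peak: Peak) -> List[float]:
    """Build a two-element feature vector: [peak length, GC fraction]."""
    return [float(peak.end - peak.start), gc_fraction(peak.sequence)]


# Toy peaks with made-up coordinates and sequences (illustrative only).
peaks = [
    Peak("chr1", 100, 108, "GGCCATAT"),
    Peak("chr1", 500, 504, "ATNN"),
]
feature_matrix = [peak_features(p) for p in peaks]
```

Under these assumptions, `feature_matrix` holds one row per peak and could be passed, together with enhancer/non-enhancer labels, to a neural network classifier such as scikit-learn's `MLPClassifier`. Which architecture and hyperparameters PEAS uses is not stated in this excerpt.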

