Abstract

Accessible chromatin is a highly informative structural feature for identifying regulatory elements and provides a large amount of information about transcriptional activity and gene regulatory mechanisms. Human ATAC-seq datasets are accumulating rapidly, prompting an urgent need to collect and process these data comprehensively and effectively. We developed a comprehensive human chromatin accessibility database (ATACdb, http://www.licpathway.net/ATACdb) that aims to provide a large, publicly available resource of human chromatin accessibility data and to annotate and illustrate the potential roles of accessible regions in a tissue/cell type-specific manner. The current version of ATACdb documents a total of 52 078 883 regions from over 1400 ATAC-seq samples, which were manually curated from over 2200 chromatin accessibility samples in NCBI GEO/SRA. To make these datasets more accessible to the research community, ATACdb provides a quality assurance process that includes four quality control (QC) metrics. ATACdb provides detailed (epi)genetic annotations of chromatin accessibility regions, including super-enhancers, typical enhancers, transcription factors (TFs), common single-nucleotide polymorphisms (SNPs), risk SNPs, eQTLs, LD SNPs, methylation, chromatin interactions and TADs. In particular, ATACdb provides accurate inference of TF footprints within chromatin accessibility regions. ATACdb is a powerful platform that provides the most comprehensive accessible chromatin data, QC, TF footprints and various other annotations.

Highlights

  • Genome-wide identification of chromatin accessibility is important for detecting regulatory elements and understanding transcriptional regulation governing biological processes such as cell fate determination, cell differentiation and disease development [1,2]

  • Several databases store chromatin accessibility data based on DNase-seq datasets, including GTRD [21], EpiRegio [22], DeepBlue [23] and OCHROdb [24]

  • Figure panel labels (figure not reproduced here): peak distribution relative to TSS; accessible chromatin region annotations (single-nucleotide polymorphisms (SNPs), common SNP, risk SNP, super-enhancer, enhancer, transcription factor binding sites (TFBSs), conserved, TAD); Differential-Overlapping-Region analysis; analysis of overlapping accessible chromatin regions bound by two transcription factors (TFs); simple information browse based on sample classification; region statistics for each sample in an alphanumerically sortable table


Summary

INTRODUCTION

Genome-wide identification of chromatin accessibility is important for detecting regulatory elements and understanding transcriptional regulation governing biological processes such as cell fate determination, cell differentiation and disease development [1,2]. Previous studies have confirmed the significance of chromatin accessibility in addressing key issues associated with biological processes, cell differentiation, cancer biology and disease development. Several databases store chromatin accessibility data based on DNase-seq datasets, including GTRD [21], EpiRegio [22], DeepBlue [23] and OCHROdb [24]. ATACdb is a user-friendly database for querying, browsing and visualizing information associated with chromatin accessibility regions.
