Abstract

Advances in omics technologies enable high-throughput quantification of many classes of biological molecules and thus allow multiple layers of information to be integrated for a comprehensive understanding of biological processes and human diseases. Among these technologies, the assay for transposase-accessible chromatin with high-throughput sequencing (ATAC-seq) was recently developed for the efficient detection of open chromatin regions across the genome, while RNA-seq is widely used to measure genome-wide gene and transcript expression. To date, few studies have applied ATAC-seq to multiple clinical samples, let alone integrated ATAC-seq and RNA-seq data from the same human donors. In this study, we generated paired-end ATAC-seq and RNA-seq data in CD4+CD45RO+CD196+ T cells from 32 pairs of human donors; the cells were treated in short-term, ex vivo cultures with T cell activation beads, IL-1B and IL-23, with or without PGE2. We analyzed the ATAC-seq data with a pipeline based on the sequence aligner Bowtie2, the peak caller MACS2, and the peak annotation and motif discovery tool Homer, and used Samtools to obtain read counts in peak regions as measures of chromatin accessibility for all samples. We analyzed the RNA-seq data with a pipeline based on STAR and Cufflinks, and counted the reads for each gene in each sample with featureCounts to obtain gene expression profiles. Next, we separately identified differentially accessible chromatin regions and differentially expressed genes between the two treatments with DESeq2, and extracted the genes that were differential in both data sets. Interestingly, the fold-changes of these genes in the two data sets were also significantly associated.
Therefore, we further integrated the two kinds of data with a linear mixed-effects model, which regressed gene expression against chromatin accessibility near the same gene, with additional adjustment for treatment and a random effect for the sample pairing, to identify chromatin regions significantly associated with the expression of the nearest gene. We also discovered motifs and inflammatory bowel disease (IBD)-related SNPs in the differentially accessible chromatin regions, which might indicate how the differential gene expression is regulated. In addition, we performed a functional analysis of the differential genes and identified enriched pathways, such as the cytokine-cytokine receptor interaction, Jak-STAT signaling and inflammatory bowel disease pathways. In summary, we developed pipelines and approaches for the integrative analysis of ATAC-seq and RNA-seq data and applied them to 32 pairs of human samples. We identified widespread significant differences in chromatin accessibility and gene expression between the two treatments, as well as many regulatory associations between the two kinds of data. Our results demonstrate the feasibility and usefulness of integrating chromatin accessibility and gene expression data in the study of regulatory mechanisms of complex diseases.
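
The integration step described above can be illustrated with a small sketch. It is not the study's implementation, and the abstract does not state which software fitted the mixed model. The sketch assumes the model has the form expression = b0 + b1 * accessibility + b2 * treatment + donor effect + noise, with exactly one untreated and one PGE2-treated sample per donor. Under that assumption, taking the treated-minus-untreated difference within each donor cancels the donor effect, so ordinary least squares on the differences recovers b1 (slope) and b2 (intercept). This within-donor estimator is a simplified stand-in for a linear mixed-effects fit, not the mixed model itself. All names (`paired_differences`, `fit_within_donor`, `toy_samples`, and the keys `donor`, `pge2`, `acc`, `expr`) and all numbers are hypothetical, and the values are assumed to be on a log scale.

```python
from statistics import mean


def paired_differences(samples):
    """Return per-donor (PGE2 minus control) differences.

    `samples` is a list of dicts with hypothetical keys:
    donor, pge2 (0 or 1), acc (log accessibility), expr (log expression).
    """
    by_donor = {}
    for s in samples:
        by_donor.setdefault(s["donor"], {})[s["pge2"]] = s

    d_acc, d_expr = [], []
    for donor in sorted(by_donor):
        pair = by_donor[donor]
        if set(pair) != {0, 1}:
            raise ValueError(f"donor {donor!r} lacks a complete pair")
        # Differencing within a donor cancels that donor's offset.
        d_acc.append(pair[1]["acc"] - pair[0]["acc"])
        d_expr.append(pair[1]["expr"] - pair[0]["expr"])
    return d_acc, d_expr


def fit_within_donor(samples):
    """Least-squares fit of d_expr = b2 + b1 * d_acc for one peak-gene pair."""
    d_acc, d_expr = paired_differences(samples)
    mx, my = mean(d_acc), mean(d_expr)
    sxx = sum((x - mx) ** 2 for x in d_acc)
    if sxx == 0:
        raise ValueError("accessibility change is identical in all donors")
    sxy = sum((x - mx) * (y - my) for x, y in zip(d_acc, d_expr))
    slope = sxy / sxx
    return {
        "accessibility_effect": slope,           # b1
        "treatment_effect": my - slope * mx,     # b2
    }


def toy_samples():
    """Noise-free toy data (invented) with b0=1.0, b1=0.5, b2=2.0."""
    donor_offsets = {"d1": 0.3, "d2": -1.2, "d3": 0.7, "d4": 0.1}
    acc = {"d1": (4.0, 5.0), "d2": (5.0, 5.5),
           "d3": (3.5, 5.5), "d4": (6.0, 6.2)}
    samples = []
    for donor, offset in donor_offsets.items():
        for pge2 in (0, 1):
            a = acc[donor][pge2]
            expr = 1.0 + 0.5 * a + 2.0 * pge2 + offset
            samples.append(
                {"donor": donor, "pge2": pge2, "acc": a, "expr": expr}
            )
    return samples


if __name__ == "__main__":
    print(fit_within_donor(toy_samples()))
```

On real data the fit would be repeated for every peak and nearest-gene pair, and a dedicated mixed-model package would normally supply the standard errors and p-values that this sketch omits.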

