ChIP-R: Assembling reproducible sets of ChIP-seq and ATAC-seq peaks from multiple replicates

Rhys Newell, Richard Pienaar, Brad Balderson, Michael Piper, Alexandra Essebier, Mikael Bodén

doi:10.1016/j.ygeno.2021.04.026

Open Access

https://doi.org/10.1016/j.ygeno.2021.04.026

Journal: Genomics	Publication Date: Apr 18, 2021
Citations: 31	License type: publisher-specific-oa

Affiliation: University of Queensland

Abstract

Chromatin immunoprecipitation followed by sequencing (ChIP-seq) is the primary protocol for detecting genome-wide DNA-protein interactions, and therefore a key tool for understanding transcriptional regulation. A number of factors, including low antibody specificity and cellular heterogeneity of the sample, may cause “peak” callers to output noise and experimental artefacts. Statistically combining multiple experimental replicates from the same condition could significantly enhance our ability to distinguish actual transcription factor binding events, even when peak caller accuracy and consistency of detection are compromised.

We adapted the rank-product test to statistically evaluate the reproducibility of peaks across any number of ChIP-seq experimental replicates. We demonstrate over a number of benchmarks that our adaptation, “ChIP-R” (pronounced ‘chipper’), performs as well as or better than comparable approaches at recovering transcription factor binding sites in ChIP-seq peak data. We also show that ChIP-R extends to the evaluation of ATAC-seq peaks, finding reproducible peak sets even at low sequencing depth. ChIP-R decomposes peaks across replicates into “fragments”, each of which either forms part of a peak in a given replicate or does not. By re-analysing existing data sets, we show that ChIP-R reconstructs reproducible peaks from fragments with enhanced biological enrichment relative to current strategies.
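
The two ideas named in the abstract, splitting peaks into fragments and ranking them across replicates, can be illustrated with a toy sketch. This is not ChIP-R's implementation: the input layout (lists of `(start, end, score)` tuples on a single chromosome), the function names `fragments` and `rank_products`, the score of 0.0 for uncovered fragments, and the ordinal handling of ties are all assumptions made for illustration. It also stops at the raw rank product and does not attempt the statistical test described in the paper.

```python
from math import prod

# Hypothetical input: each replicate is a list of (start, end, score) peaks
# on one chromosome; a higher score means a stronger peak.

def fragments(replicates):
    """Split the union of all peaks at every peak boundary."""
    bounds = sorted({b for rep in replicates for s, e, _ in rep for b in (s, e)})
    frags = []
    for lo, hi in zip(bounds, bounds[1:]):
        # Keep an interval only if at least one replicate has a peak covering it.
        if any(s <= lo and hi <= e for rep in replicates for s, e, _ in rep):
            frags.append((lo, hi))
    return frags

def rank_products(replicates):
    """Return {fragment: product of its per-replicate ranks} (rank 1 = strongest)."""
    frags = fragments(replicates)
    per_rep_ranks = []
    for rep in replicates:
        # Score of the covering peak, or 0.0 where this replicate has no peak
        # (an assumption of this sketch).
        scores = [
            max((sc for s, e, sc in rep if s <= lo and hi <= e), default=0.0)
            for lo, hi in frags
        ]
        # Ordinal ranks by descending score; ties are broken by genomic order.
        order = sorted(range(len(frags)), key=lambda i: -scores[i])
        ranks = [0] * len(frags)
        for rank, i in enumerate(order, start=1):
            ranks[i] = rank
        per_rep_ranks.append(ranks)
    return {frag: prod(r[i] for r in per_rep_ranks) for i, frag in enumerate(frags)}

rep1 = [(100, 200, 9.0), (500, 600, 3.0)]
rep2 = [(150, 250, 8.0), (520, 580, 1.0)]
rp = rank_products([rep1, rep2])
best = min(rp, key=rp.get)  # fragment with the smallest rank product
```

In this toy input, the fragment with the smallest rank product is the region where the two strongest peaks overlap, which matches the intuition that a low rank product indicates consistently high ranking across replicates.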

Full Text