AIControl: replacing matched control experiments with machine learning improves ChIP-seq peak identification.

Naozumi Hiranuma, Scott M Lundberg, Su-In Lee

doi:10.1093/nar/gkz156

Open Access

Journal: Nucleic Acids Research
Publication Date: Mar 14, 2019
Citations: 11
License type: CC BY-NC 4.0

Affiliation: University of Washington

Abstract

ChIP-seq is a technique to determine binding locations of transcription factors, which remains a central challenge in molecular biology. Current practice is to use a ‘control’ dataset to remove background signals from an immunoprecipitation (IP) ‘target’ dataset. We introduce the AIControl framework, which eliminates the need to obtain a control dataset and instead identifies binding peaks by estimating the distributions of background signals from many publicly available control ChIP-seq datasets. We thereby avoid the cost of running control experiments while simultaneously increasing the accuracy of binding location identification. Specifically, AIControl can (i) estimate background signals at fine resolution, (ii) systematically weight the most appropriate control datasets in a data-driven way, (iii) capture sources of potential biases that may be missed by one control dataset and (iv) remove the need for costly and time-consuming control experiments. We applied AIControl to 410 IP datasets in the ENCODE ChIP-seq database, using 440 control datasets from 107 cell types to impute background signal. Without using matched control datasets, AIControl identified peaks that were more enriched for putative binding sites than those identified by other popular peak callers that used a matched control dataset. We also demonstrated that our framework identifies binding sites that recover documented protein interactions more accurately.
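The abstract does not specify the model, so the following is only a rough illustration of what "weighting control datasets in a data-driven way" could look like. It assumes ridge regression to combine control tracks into a background estimate and a Poisson tail test to flag enriched bins. Both choices, the function names (`ridge_weights`, `poisson_sf`, `call_peaks`), the thresholds, and the toy read counts are hypothetical and are not taken from the AIControl implementation.

```python
import math

def ridge_weights(controls, target, lam=1.0):
    """Solve (X^T X + lam*I) w = X^T y, where each control track is a column of X."""
    k = len(controls)
    A = [[sum(a * b for a, b in zip(controls[i], controls[j])) + (lam if i == j else 0.0)
          for j in range(k)] for i in range(k)]
    b = [sum(c * t for c, t in zip(controls[i], target)) for i in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    w = [0.0] * k
    for i in reversed(range(k)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, k))) / A[i][i]
    return w

def poisson_sf(k, mu):
    """Return P(X >= k) for X ~ Poisson(mu)."""
    if k <= 0:
        return 1.0
    term = math.exp(-mu)   # P(X = 0)
    cdf = term
    for i in range(1, k):  # accumulate P(X = 1) .. P(X = k-1)
        term *= mu / i
        cdf += term
    return max(0.0, 1.0 - cdf)

def call_peaks(target, controls, lam=1.0, alpha=1e-3, floor=0.1):
    """Flag bins whose IP count is unlikely under the imputed background (assumed test)."""
    weights = ridge_weights(controls, target, lam)
    peaks = []
    for i, observed in enumerate(target):
        # Weighted sum of control counts; floored so the Poisson mean stays positive.
        background = max(floor, sum(w * c[i] for w, c in zip(weights, controls)))
        if poisson_sf(observed, background) < alpha:
            peaks.append(i)
    return weights, peaks

# Made-up read counts per genomic bin for two control tracks and one IP track.
control_a = [2, 3, 2, 3, 2, 3, 2, 3]
control_b = [1, 1, 4, 1, 1, 4, 1, 1]
target = [3, 4, 6, 30, 3, 7, 3, 4]

weights, peaks = call_peaks(target, [control_a, control_b])
print("weights:", [round(w, 3) for w in weights])
print("peak bins:", peaks)
```

This sketch fits the weights on the same bins it later tests, which is a simplification; a real pipeline would work at genome scale with many more control tracks and would need to handle sequencing depth and multiple testing.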

Similar Papers

Dual threshold optimization and network inference reveal convergent evidence from TF binding locations and TF perturbation responses.
Yiming Kang ... R Scott Mcisaac
Genome Research | VOL. 30 | 14 Feb 2020

Genome-wide histone acetylation data improve prediction of mammalian transcription factor binding sites
Stephen A Ramsey ... Mark Gilchrist
Bioinformatics | VOL. 26 | 27 Jul 2010

Using methylation data to improve transcription factor binding prediction
Daniel Morgan ... Kimberly Glass
Epigenetics | VOL. 19 | 01 Feb 2024

Predicting which genes will respond to transcription factor perturbations.
Yiming Kang ... Wooseok J Jung
G3 Genes|Genomes|Genetics | VOL. 12 | 06 Jun 2022
