Abstract
Capture Hi-C (CHi-C) is a method for profiling chromosomal interactions involving targeted regions of interest, such as gene promoters, globally and at high resolution. Signal detection in CHi-C data involves a number of statistical challenges that are not observed when using other Hi-C-like techniques. We present a background model and algorithms for normalisation and multiple testing that are specifically adapted to CHi-C experiments. We implement these procedures in CHiCAGO (http://regulatorygenomicsgroup.org/chicago), an open-source package for robust interaction detection in CHi-C. We validate CHiCAGO by showing that promoter-interacting regions detected with this method are enriched for regulatory features and disease-associated SNPs.
Electronic supplementary material: The online version of this article (doi:10.1186/s13059-016-0992-2) contains supplementary material, which is available to authorized users.
Highlights
Chromosome conformation capture (3C) technology has revolutionised the analysis of nuclear organisation, leading to important insights into gene regulation [1].
We demonstrate the efficacy of CHiCAGO on two datasets: one from the human lymphoblastoid cell line GM12878 [3] and another from mouse embryonic stem cells [4].
It is generally accepted that this effect reflects the reduction in the frequency of random collisions between genomic fragments owing to constrained Brownian motion of chromatin, in a manner consistent with molecular dynamics simulations [18].
Summary
Chromosome conformation capture (3C) technology has revolutionised the analysis of nuclear organisation, leading to important insights into gene regulation [1]. CHi-C designs such as Promoter CHi-C and HiCap [3, 4, 5, 11] involve large numbers (many thousands) of spatially dispersed baits. This presents the opportunity to increase the robustness of signal detection by sharing information across baits. The bait-specific factors reflect the technical biases of both Hi-C and sequence capture, as well as local effects such as chromatin accessibility. We estimate these factors in a way that is robust to the presence of a small fraction of interactions in the data. Estimating other end-specific bias factors poses a challenge, as the majority of interactions are removed at the capture stage that enriches for only a small subset of interactions with baits. The dispersion, r, is estimated using standard maximum likelihood methods.
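The summary does not spell out how the dispersion is fitted, so the following is only a minimal sketch of what a maximum likelihood estimate of r could look like. It assumes that read counts follow a negative binomial distribution with known expected means and a single shared dispersion r. The function names `nb_loglik` and `estimate_dispersion`, the search bounds and the golden-section optimiser are illustrative choices, not CHiCAGO's actual interface or procedure.

```python
import math


def nb_loglik(r, counts, means):
    """Negative binomial log-likelihood with per-observation means
    and one shared dispersion (size) parameter r > 0."""
    total = 0.0
    for x, mu in zip(counts, means):
        total += (
            math.lgamma(x + r) - math.lgamma(r) - math.lgamma(x + 1)
            + r * math.log(r / (r + mu))
            + x * math.log(mu / (r + mu))
        )
    return total


def estimate_dispersion(counts, means, lo=1e-3, hi=1e3, iters=100):
    """Maximise the log-likelihood over r in [lo, hi] by golden-section
    search on log(r). Bounds and optimiser are assumptions of this sketch."""
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log(lo), math.log(hi)

    def f(t):
        return nb_loglik(math.exp(t), counts, means)

    c = b - phi * (b - a)
    d = a + phi * (b - a)
    fc, fd = f(c), f(d)
    for _ in range(iters):
        if fc > fd:
            # The maximum lies in [a, d]; reuse c as the new upper probe.
            b, d, fd = d, c, fc
            c = b - phi * (b - a)
            fc = f(c)
        else:
            # The maximum lies in [c, b]; reuse d as the new lower probe.
            a, c, fc = c, d, fd
            d = a + phi * (b - a)
            fd = f(d)
    return math.exp((a + b) / 2)


if __name__ == "__main__":
    # Toy, made-up counts with a common expected mean of 3 reads.
    toy_counts = [0, 1, 0, 3, 7, 2, 0, 12, 1, 4]
    toy_means = [3.0] * len(toy_counts)
    print(estimate_dispersion(toy_counts, toy_means))
```

Searching on the log scale keeps r positive and treats small and large dispersions symmetrically. In a real analysis the expected means would come from the fitted background model rather than being fixed by hand as in the toy example.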