Abstract

As a revolutionary tool, the Hi-C technology can be used to capture genomic segments that are in close spatial proximity in three-dimensional space, enabling the study of chromosome structures at unprecedentedly high throughput and resolution. However, during the experimental steps of Hi-C, systematic biases from different sources are often introduced into the resultant data (i.e., reads or read counts). Several bias reduction methods have been proposed recently. Although both systematic biases and spatial distance are known to be key factors determining the number of observed chromatin interactions, the existing bias reduction methods in the literature do not include spatial distance explicitly in their computational models for estimating the interactions. In this work, we propose an improved Poisson regression model and an efficient gradient descent based algorithm, GDNorm, for reducing biases in Hi-C data that take spatial distance into consideration. GDNorm has been tested on both simulated and real Hi-C data, and its performance has been compared with that of state-of-the-art bias reduction methods. The experimental results show that our improved Poisson model provides more accurate normalized contact frequencies (measured in read counts) between interacting genomic segments and thus a more accurate chromosome structure prediction when combined with a chromosome structure determination method such as ChromSDE. Moreover, when assessed on recently published data from human lymphoblastoid and mouse embryonic stem cell lines, GDNorm achieves the highest reproducibility between the biological replicates of the cell lines. The normalized contact frequencies obtained by GDNorm are well correlated with the spatial distances measured by fluorescent in situ hybridization (FISH) experiments. In addition to accurate bias reduction, GDNorm has the highest time efficiency among the compared methods on the real data.
GDNorm is implemented in C++ and is available at http://www.cs.ucr.edu/~yyang027/gdnorm.htm

Keywords: chromosome conformation capture; Hi-C data; systematic bias reduction; Poisson regression; gradient descent
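
To make the general idea of a Poisson regression fitted by gradient descent more concrete, the following is a minimal Python sketch. It is not the GDNorm implementation (which is written in C++), and it does not reproduce the model proposed in this work. It assumes a generic log-linear model, log(mu) = w0 + w1 * log_bias + w2 * log_dist, in which `log_bias` stands for a single combined bias covariate and `log_dist` for a centered log spatial distance. The function names, the covariates, the weights in `TRUE_W`, and the toy data are all hypothetical. Treating the distance as a known input is a simplification made for illustration and may not match how GDNorm handles it.

```python
import math

def fit_poisson_gd(X, y, lr=0.02, n_iter=2000):
    """Fit log(mu_i) = w . x_i by batch gradient descent on the mean
    negative Poisson log-likelihood; its gradient is mean((mu_i - y_i) * x_i)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        for x_i, y_i in zip(X, y):
            mu_i = math.exp(sum(w_j * x_ij for w_j, x_ij in zip(w, x_i)))
            for j in range(p):
                grad[j] += (mu_i - y_i) * x_i[j] / n
        w = [w_j - lr * g_j for w_j, g_j in zip(w, grad)]
    return w

def remove_bias(count, log_bias, w_bias):
    """Divide a count by the fitted bias factor exp(w_bias * log_bias)."""
    return count / math.exp(w_bias * log_bias)

# Toy data (hypothetical): one combined log-bias covariate and a centered
# log spatial distance on a small grid. The counts are noise-free expected
# values, so the fit is exactly reproducible.
TRUE_W = [2.0, 1.0, -1.0]  # intercept, bias weight, distance weight
log_biases = [-0.5, -0.25, 0.0, 0.25, 0.5]
log_dists = [-0.6, -0.3, 0.0, 0.3, 0.6]
X = [[1.0, b, d] for b in log_biases for d in log_dists]
y = [math.exp(sum(t * v for t, v in zip(TRUE_W, row))) for row in X]

w = fit_poisson_gd(X, y)
# Normalized counts keep the intercept and distance terms only.
normalized = [remove_bias(c, row[1], w[1]) for c, row in zip(y, X)]
print([round(v, 3) for v in w])
```

On this toy input the fitted weights should come out close to `TRUE_W`, and the normalized counts should then depend on the distance covariate only. Real Hi-C read counts are noisy and involve several bias sources, so the learning rate, the number of iterations, and the choice of covariates would all need to be revisited.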
