Abstract

The DNA sequence preferences of nucleosomes can be captured in a probabilistic model that ascribes a probability or, equivalently, an energy to any 147-bp sequence. This is generally done using a Markov chain model, in which the total probability of a sequence is calculated from the probability distributions of sub-sequences of some fixed length (usually dinucleotides), under the assumption that long-distance correlations are less important. These probability distributions must be trained on some subset of the entire sequence space. In the past this has been done experimentally, with certain statistical limitations and caveats. Using a novel simulation method, we are now able to produce large ensembles for training in silico, derived from an underlying energetic nucleosome model. This allows us, for the first time, to examine quantitatively how well such short-range probabilistic models approximate reality. As a next step, a probabilistic model trained on a high-quality sequence ensemble can be used to predict energy landscapes for DNA sequences at negligible computational cost. This will allow us to test the idea that DNA may have evolved mechanical signals for nucleosome positioning, by simulating such evolution and comparing the patterns we find to patterns found in real genomes.
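
As an illustration of the kind of model described above, the following is a minimal sketch of a position-dependent dinucleotide (first-order Markov chain) model. It is not the authors' implementation. The function names, the pseudocount smoothing, and the convention of reporting the energy as E = -ln P in units of kT (defined only up to an additive constant) are assumptions made for this example. The training ensemble is taken to be a list of equal-length sequences over the alphabet ACGT, which would be 147 bp long in the nucleosome case.

```python
import math

BASES = "ACGT"

def train_dinucleotide_model(ensemble, pseudocount=1.0):
    """Estimate a position-dependent first-order Markov chain from sequences.

    Returns (start, trans), where start[b] is P(base b at position 0) and
    trans[i][a][b] is P(base b at position i | base a at position i-1).
    The pseudocount is an assumed smoothing choice that avoids zero
    probabilities for dinucleotides missing from the ensemble.
    """
    n = len(ensemble[0])
    start = {b: pseudocount for b in BASES}
    # trans[0] is never used; it is kept so that indices match positions.
    trans = [{a: {b: pseudocount for b in BASES} for a in BASES}
             for _ in range(n)]
    for seq in ensemble:
        if len(seq) != n:
            raise ValueError("all training sequences must have equal length")
        start[seq[0]] += 1
        for i in range(1, n):
            trans[i][seq[i - 1]][seq[i]] += 1
    # Normalize counts into probabilities.
    total = sum(start.values())
    start = {b: c / total for b, c in start.items()}
    for i in range(1, n):
        for a in BASES:
            row_total = sum(trans[i][a].values())
            for b in BASES:
                trans[i][a][b] /= row_total
    return start, trans

def log_probability(seq, model):
    """ln P(seq) = ln P_0(s_0) + sum_i ln P_i(s_i | s_{i-1})."""
    start, trans = model
    if len(seq) != len(trans):
        raise ValueError("sequence length must match the model length")
    logp = math.log(start[seq[0]])
    for i in range(1, len(seq)):
        logp += math.log(trans[i][seq[i - 1]][seq[i]])
    return logp

def energy(seq, model):
    """Energy in units of kT, up to an additive constant: E = -ln P."""
    return -log_probability(seq, model)

def energy_landscape(genome, model):
    """Energy of every window of model length along a longer sequence."""
    window = len(model[1])
    return [energy(genome[i:i + window], model)
            for i in range(len(genome) - window + 1)]
```

Once trained, evaluating a sequence costs one table lookup per base pair, which is why such a model can score every window of a long sequence cheaply. For example, `energy_landscape(genome, model)` returns one energy per possible nucleosome position along `genome`.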
