Abstract
A visualization scheme for avoided and under-represented strings in complete genomes reveals regular patterns that become exact fractals in the non-biological limit of infinitely long strings. This raises the problem of calculating the dimensions of these fractals, which have self-similar and self-overlapping structures. Directly counting avoided strings in a complete genome raises a second problem: distinguishing true from redundant avoided strings of a given length when the number of shorter ones is known. The two problems turn out to be one and the same, and are solved exactly by two different methods: combinatorics and language theory.
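The counting problem can be illustrated with a minimal brute-force sketch. It assumes that a "true" avoided string is one that is absent from the sequence while none of its proper substrings is absent, and that a "redundant" avoided string is one that contains a shorter avoided string. This reading of the two terms, the function names, and the toy two-letter alphabet are illustrative assumptions rather than the paper's own definitions or method, and the sketch enumerates strings directly instead of using the combinatorial or language-theoretic solutions.

```python
from itertools import product


def present_strings(seq, k):
    """Return the set of all length-k substrings that occur in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def classify_avoided(seq, k, alphabet="ACGT"):
    """Split the avoided strings of length k into (true, redundant).

    Assumption (not from the source): an avoided string is 'redundant'
    if it contains a shorter avoided string, and 'true' otherwise.
    Any proper substring lies inside the length-(k-1) prefix or suffix,
    so it is enough to check whether those two occur in seq.
    """
    seen_k = present_strings(seq, k)
    seen_shorter = present_strings(seq, k - 1) if k > 1 else {""}
    true_avoided, redundant = set(), set()
    for letters in product(alphabet, repeat=k):
        s = "".join(letters)
        if s in seen_k:
            continue  # s occurs in seq, so it is not avoided
        if s[:-1] in seen_shorter and s[1:] in seen_shorter:
            true_avoided.add(s)
        else:
            redundant.add(s)
    return true_avoided, redundant


# Toy example over a hypothetical two-letter alphabet.
true_3, redundant_3 = classify_avoided("AABAB", 3, alphabet="AB")
print(sorted(true_3), sorted(redundant_3))
```

In this toy sequence, "BB" is the only avoided string of length 2, so every avoided string of length 3 that contains "BB" is counted as redundant, and only the remaining avoided strings are genuinely new at length 3.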