We investigate correlated binary sequences using an $n$-tuple Zipf analysis, where we define ``words'' as strings of length $n$, and calculate the normalized frequency of occurrence $\omega(R)$ of ``words'' as a function of the word rank $R$. We analyze sequences with short-range Markovian correlations, as well as those with long-range correlations generated by three different methods: inverse Fourier transformation, L\'evy walks, and the expansion-modification system. We study the relation between the exponent $\alpha$ characterizing long-range correlations and the exponent $\zeta$ characterizing power-law behavior in the Zipf plot. We also introduce a function $P(\omega)$, the frequency density, which is related to the inverse Zipf function $R(\omega)$, and find a simple relationship between $\zeta$ and $\psi$, where $\omega(R) \sim R^{-\zeta}$ and $P(\omega) \sim \omega^{-\psi}$. Further, for Markovian sequences, we derive an approximate form for $P(\omega)$. Finally, we study the effect of a coarse-graining ``renormalization'' on sequences with Markovian and with long-range correlations.
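
The n-tuple Zipf analysis described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names `markov_binary_sequence` and `zipf_ranking`, the persistence parameter `p_repeat`, and the use of overlapping (sliding-window) words are all assumptions made for illustration; the abstract does not say how words are extracted or how the Markovian sequences are parametrized.

```python
import random
from collections import Counter


def markov_binary_sequence(length, p_repeat, seed=0):
    """Generate a two-state Markov chain of 0s and 1s.

    Assumption: each symbol repeats the previous one with probability
    p_repeat and flips otherwise (a simple short-range correlated model).
    """
    rng = random.Random(seed)
    seq = [rng.randint(0, 1)]
    for _ in range(length - 1):
        prev = seq[-1]
        seq.append(prev if rng.random() < p_repeat else 1 - prev)
    return seq


def zipf_ranking(seq, n):
    """Return (rank R, word, normalized frequency omega) sorted by rank.

    Assumption: words are read with a sliding window of length n,
    so consecutive words overlap in n - 1 symbols.
    """
    counts = Counter(tuple(seq[i:i + n]) for i in range(len(seq) - n + 1))
    total = sum(counts.values())
    # Rank 1 is the most frequent word; frequencies are normalized to sum to 1.
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return [(rank, word, count / total)
            for rank, (word, count) in enumerate(ranked, start=1)]


seq = markov_binary_sequence(20000, p_repeat=0.8)
ranking = zipf_ranking(seq, n=4)
for rank, word, omega in ranking[:5]:
    print(rank, "".join(map(str, word)), round(omega, 4))
```

Plotting `omega` against `rank` on log-log axes gives the Zipf plot. A straight-line region there would correspond to the power-law regime $\omega(R) \sim R^{-\zeta}$.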