Abstract

The development of mathematical methods for studying periodicity in symbolic sequences has become especially important. This is chiefly a consequence of the successful sequencing of DNA from various genomes and the accumulation of a large number of amino acid sequences. Mathematicians and biologists therefore face the problem of determining the structural features of these sequences and of finding the biological meaning of the features revealed. One such structural feature is the periodicity of a symbolic sequence. Comprehensive mathematical methods were developed earlier for studying the periodicity of continuous and discrete numerical sequences; they use the Fourier transform and make it possible to determine the spectral density of a numerical sequence. Applying the Fourier transform in this way, however, requires representing the symbolic sequence as a numerical sequence that reflects the properties of the symbolic text unambiguously. The most widely used method constructs from the given symbolic sequence m sequences of zeros and ones, formed according to the rule x(i, j) = 1 if the symbol a_i occupies position j, and x(i, j) = 0 otherwise. Here A = {a_1, a_2, ..., a_m} is the alphabet of the symbolic sequence and m is the size of that alphabet. The Fourier transform is then applied to each of these numerical sequences, and the Fourier harmonics corresponding to symbols of type i are calculated, together with the matrix structure factors corresponding to pair correlations of symbols [6]. In our opinion, however, this method works reasonably well for detecting periods of relatively short length (shorter than the size of the alphabet of the symbolic sequence).
For periods longer than the size of the alphabet of the symbolic sequence, the statistical significance of a longer period can be "decomposed" in favor of shorter ones. The statistical significance of the longer period is thus "spread" over the statistical significance of the shorter periods; that is, harmonics with longer periods are attenuated in favor of harmonics with shorter periods. This effect is even stronger when the periodic sequence contains several substitutions, so that its periods are no longer exactly identical. The main purpose of this work is to present our results from studying DNA sequences by the ID method and to show that latent periodicity exists in many gene DNA sequences.
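
The binary indicator construction and the per-symbol Fourier transform described above can be illustrated with a minimal sketch. It is not the authors' code and it does not implement the ID method: the function names (`indicator_sequences`, `dft`, `power_spectrum`), the toy sequence, and the choice to sum the squared harmonic amplitudes over all symbols are assumptions made purely for illustration, and the matrix structure factors of [6] are not computed.

```python
import cmath


def indicator_sequences(seq, alphabet):
    """Build x(i, j): 1 if symbol a_i occupies position j, else 0."""
    return {a: [1 if s == a else 0 for s in seq] for a in alphabet}


def dft(x):
    """Plain O(n^2) discrete Fourier transform of a numeric sequence."""
    n = len(x)
    return [
        sum(x[j] * cmath.exp(-2j * cmath.pi * k * j / n) for j in range(n))
        for k in range(n)
    ]


def power_spectrum(seq, alphabet):
    """Sum over all symbols of the squared harmonic amplitude at each k.

    Summing over symbols is an illustrative choice, not taken from the source.
    """
    total = [0.0] * len(seq)
    for x in indicator_sequences(seq, alphabet).values():
        for k, coeff in enumerate(dft(x)):
            total[k] += abs(coeff) ** 2
    return total


# Hypothetical toy input: a perfect period-3 repeat of length 24.
seq = "ACG" * 8
spectrum = power_spectrum(seq, "ACGT")

# Strongest harmonic among the non-zero frequencies up to n/2.
peak_k = max(range(1, len(seq) // 2 + 1), key=lambda k: spectrum[k])
period = len(seq) / peak_k
print(peak_k, period)
```

For this perfect repeat the strongest harmonic falls at k = 8, which corresponds to a period of 24 / 8 = 3. The toy case has a period shorter than the alphabet size, which is the regime where the abstract says the indicator approach behaves well.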

