A theory of an early stage of genome evolution by combinatorial fusion of circular DNA units is proposed, based on protein sequence "fossil" evidence. The evidence includes a preference of protein sequence lengths for certain sizes: multiples of 123 aa for eukaryotes and multiples of 152 aa for prokaryotes. At the DNA level these sizes correspond to 350-450 base pairs, the known optimal range for DNA ring closure. Methionine residues recur along the sequences with the same period of about 120 aa (in eukaryotes), presumably marking the sites of insertion of the early genes, which were rings of protein-coding DNA. The absence of torsional constraint in this DNA results in a very sharp estimate of the helical periodicity of the early DNA, indistinguishable from the experimental mean value for extant DNA. According to the combinatorial fusion theory, based on the above evidence, in the pregenomic, prerecombinational stage the genes and the noncoding sequences existed in the form of autonomously replicating DNA rings of close to standard size, segregating randomly between dividing cells, as modern plasmids do. In the recombinational early genomic stage the rings started to fuse, forming larger DNA molecules that consisted of several unit genes connected in various combinations and that formed long protein-coding sequences (combinatorial fusion). This process, which perhaps involved noncoding sequences as well, eventually resulted in the formation of large genomes. The dispersed circular DNA, or rather its evolutionarily advanced derivatives, may still exist in the form of various mobile DNA elements.