Abstract

The distribution of deviations from Chargaff's second parity rule was examined for overlapping sequence windows of a length (1 kb) predicted to be suitable for detecting correlations with functional features of DNA. For long genomic segments from E. coli, Saccharomyces cerevisiae, and Vaccinia virus, Chargaff differences for the W bases and/or for the S bases correlate with transcription direction and gene location. For W-rich genomes, the mRNA-synonymous strand contains regions which, if extruded from negatively supercoiled DNA, would fold to generate stem-loop structures with A-rich loops. Similarly, for S-rich genomes the loops would be G-rich. We suggest that the disposition of genes in nucleic acid sequences arises from their having to adapt to a preexisting mosaic of genomic regions, each distinguished by its potential to extrude single-strand loops enriched for a particular base (or two non-Watson-Crick pairing bases). The mosaic would have facilitated the intrastrand and interstrand accounting required for correction of mutations, and would have evolved in the early RNA world before the emergence of protein-encoding capacity. The preexisting mosaic would have determined transcription direction since there is pressure for all mRNAs of a cell to have purine-rich loops, thus decreasing loop-loop interactions which might lead to formation of "self" sense-antisense RNA duplexes.
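The windowed measurement described in the first sentence can be sketched as follows. This is a minimal illustration, not the authors' procedure: the function name `chargaff_differences`, the step size of 100 bases, the normalization by W or S content, and the sign convention (A minus T, C minus G) are all assumptions, since the abstract states only the 1 kb window length and that the windows overlap.

```python
from typing import Iterator, Tuple


def chargaff_differences(
    seq: str, window: int = 1000, step: int = 100
) -> Iterator[Tuple[int, float, float]]:
    """Yield (start, w_diff, s_diff) for each overlapping window of one strand.

    Assumed definitions (not taken from the abstract):
      w_diff = (A - T) / (A + T)   deviation for the W bases
      s_diff = (C - G) / (C + G)   deviation for the S bases
    Under Chargaff's second parity rule both values would be close to zero.
    """
    seq = seq.upper()
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        a, t = chunk.count("A"), chunk.count("T")
        c, g = chunk.count("C"), chunk.count("G")
        w_total = a + t
        s_total = c + g
        # Report 0.0 when a window contains no W (or no S) bases at all.
        w_diff = (a - t) / w_total if w_total else 0.0
        s_diff = (c - g) / s_total if s_total else 0.0
        yield start, w_diff, s_diff


if __name__ == "__main__":
    # Toy sequence with tiny windows, for illustration only.
    for start, w_diff, s_diff in chargaff_differences("AAATCCCG", window=4, step=2):
        print(start, w_diff, s_diff)
```

Plotting `w_diff` and `s_diff` against window position alongside annotated gene coordinates and orientations would be one way to look for the kind of correlation with transcription direction that the abstract reports.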
