Abstract

The DNA of genes in different organisms varies widely in G+C content, from 20% to 80%. This variation has been regarded as the result of the compliance of the genetic code with base-composition-deflecting mutational pressure. To make a quantitative discussion of this elasticity of the genetic code possible, we made a statistical study of the G+C frequency at the 1st, 2nd, and 3rd positions of codons: 4.5 × 10^6 codons in 11,981 protein-coding regions in the DNA database were analyzed. The data were examined quantitatively using a species-independent universal equation that describes the base frequencies at the three codon sites in terms of constraint parameters characteristic of the sites and an intersite interaction. By fitting theoretical curves to the data points, we determined the constraint parameters and the characteristic G+C contents to which the 1st- and 2nd-site base compositions are bound. Base-substituting mutation of the coding sequence under base-composition-deflecting pressure is divided into the following three stages of differing compliance, from elastic to rigid: 1) the 3rd position of codons changes by synonymous substitution; 2) the 1st and then the 2nd positions change, accompanied by amino acid replacement; and 3) in organisms exposed to an extremely high base-composition-deflecting pressure, the codon table itself is forced to change. The compliance parameters were derived quantitatively for the first two stages. In conclusion, a simultaneous analysis of data from organisms as diverse as viruses and man revealed a set of constraints common across species that governs the frequency of codon bases and can be described by a universal equation.
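
The quantity the study tabulates, the G+C frequency at each of the three codon positions, can be illustrated with a minimal Python sketch. It is not the authors' procedure, and it does not implement the universal equation or the fitting step, which the abstract does not specify. The function name `gc_by_codon_position` is hypothetical. The sketch assumes the input is an in-frame coding sequence, drops any trailing partial codon, and counts ambiguous bases such as N as non-G+C.

```python
from typing import Dict


def gc_by_codon_position(cds: str) -> Dict[int, float]:
    """Return the G+C fraction at codon positions 1, 2 and 3.

    Assumptions (not from the source): the sequence is in frame,
    a trailing partial codon is ignored, and any base other than
    G or C (including ambiguity codes) counts as non-G+C.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3  # length covered by complete codons
    if usable == 0:
        raise ValueError("sequence contains no complete codon")

    fractions: Dict[int, float] = {}
    for offset in range(3):
        # Every third base starting at this offset is one codon position.
        bases = seq[offset:usable:3]
        gc_count = sum(1 for base in bases if base in "GC")
        fractions[offset + 1] = gc_count / len(bases)
    return fractions


if __name__ == "__main__":
    # Toy sequence with codons ATG GCC AAG TAA.
    print(gc_by_codon_position("ATGGCCAAGTAA"))
```

For the toy sequence, positions 1 and 2 each have a G+C fraction of 0.25 and position 3 has 0.75. Pooling such per-position values over many coding regions per organism would give the kind of data points to which the theoretical curves were fitted.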
