Phylogenetic trees requiring the lowest sum of nucleotide replacements and gene duplicative events were constructed from the amino acid sequence data on ten gnathostome parvalbumins (PAR) and two related myofibrillar proteins, troponin-C (TNC) and myosin alkali-light-chain (ALC). The origin and differentiation of the structural domains within these proteins were also investigated by the maximum parsimony method and by an alignment statistic for identifying evolutionarily related protein sequences. The results suggest, in agreement with the Weeds-McLachlan model, that tandem duplications in a precursor gene caused a primordial one-domain polypeptide (consisting of two helices with a calcium-binding region in between) to double and then quadruple in size. Duplications of the gene coding for this four-domain (I–II–III–IV) protein in an early metazoan, pre-gnathostome lineage gave rise to the separate loci for TNC, ALC, and PAR. TNC, which alone retained the Ca-binding function in each of its four domains, evolved much more slowly than either the ALC or PAR lineages. In the PAR lineage the I–II–III–IV structure was degraded, presumably by a partial gene deletion, to the II–III–IV structure during descent to the gnathostome ancestor of parvalbumins. Also during this period the mid region in domain II lost its Ca-binding function and, as it did so, evolved at an accelerated rate relative to other regions, a pattern indicative of positive selection for a change in function. In turn, from the gnathostome ancestor to the present, the mid regions of domains III and IV, which each retained Ca-binding function, evolved much more slowly than other regions, a pattern indicative of stabilizing selection for preservation of function. Between the gnathostome and teleost-tetrapod ancestor a gene duplication separated the parvalbumins into an α-lineage and a β-lineage.
During this early vertebrate period PAR genes evolved at the extremely fast rate of 89 nucleotide replacements per 100 codons per 10^8 years (i.e. 89 NR %), but from the teleost-tetrapod ancestor to the present, both α- and β-PAR lineages evolved at a much slower rate, about 8 NR %. The use of β-parvalbumins as phylogenetic markers was complicated by presumptive evidence that paralogous (i.e. duplication dependent) gene lineages occur within this group. As a final point, in the genealogy of TNC, ALC, and PAR lineages, a non-random pattern of nucleotide replacements was observed between the reconstructed ancestral and descendant mRNA sequences. The pattern was similar to that observed for other protein genealogies and seems to reflect a bias in the genetic code for guanine to adenine and adenine to guanine transitions (especially at the first nucleotide position of the RNA codons) to produce amino acid substitutions which are compatible with the preservation of protein three-dimensional structure.