Avoiding Catastrophic Mutations Accurately Predicts Amino Acid to Codon Pairing.

Abstract

DNA codon mutations involving Stop signals or the amino acid cysteine can be especially damaging. The former can break protein sequences or add extraneous amino acids. The latter can add or subtract disulfide bonds crucial in protein folding. We present a hypothetical scenario in which Stop codons were present early in the evolution of the genetic code, and minimizing catastrophic mutations for code networks affected all subsequent amino acid/codon pairings. Predicted features of this "Catastrophic Mutation Minimization Hypothesis" (CMMH) are that: (1) Cysteine is mutationally adjacent to Stop, isolating a contiguous codon 'neighborhood' with high potential for catastrophe. (2) The order of amino acid additions determines codon assignments through minimizing network-wide mutation costs. Overall, codon locations for 16 of the 20 amino acids in the genetic code are consistent with the CMMH, as are multiple other predictions. We propose that an antecedent genetic code consisted of 16 doublet codons specifying 13-14 amino acids. Two variations of these networks are less susceptible to catastrophic mutations than 88.2-97.5% of randomly generated ones. Unlike some previous hypotheses, CMMH does not require the total replacement or rearrangement of amino acids at codons, with its disruptive potential for protein synthesis. Finally, the composition of this ancestral doublet genetic code has all the modern code's utility: amino acids from four chemical types; start and stop signals; metal-binding ability; disulfide bridging for creating protein shapes; and possible epigenetic gene regulation. Thus, the modern code likely fine-tuned antecedent capabilities through evolution, rather than significantly increasing competence for making complex proteins.
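Prediction (1) in the abstract can be illustrated on the modern standard triplet code. The following is a minimal sketch, not the authors' method: it does not model their doublet networks or their mutation-cost function. It only lists which amino acids sit one point mutation away from a Stop codon and checks that both cysteine codons are among them. The names `CODE`, `neighbors`, `adjacent_to_stop`, and `cys_next_to_stop` are hypothetical helpers introduced for illustration.

```python
from itertools import product

# Standard genetic code in the usual compact form: bases ordered U, C, A, G,
# third position varying fastest; "*" marks a Stop codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def neighbors(codon):
    """Return the nine codons that differ from `codon` by exactly one base."""
    return [
        codon[:i] + b + codon[i + 1:]
        for i in range(3)
        for b in BASES
        if b != codon[i]
    ]


STOPS = {c for c, aa in CODE.items() if aa == "*"}

# Amino acids reachable from some Stop codon by a single point mutation.
adjacent_to_stop = {CODE[n] for s in STOPS for n in neighbors(s)} - {"*"}

# For each cysteine codon, the Stop codons it can mutate into in one step.
cys_next_to_stop = {
    c: sorted(n for n in neighbors(c) if n in STOPS)
    for c, aa in CODE.items()
    if aa == "C"
}

print(sorted(adjacent_to_stop))
print(cys_next_to_stop)
```

In the standard code both cysteine codons (UGU and UGC) neighbor the Stop codon UGA, which is the kind of adjacency the hypothesis refers to. Whether that adjacency reflects cost minimization is the paper's claim and is not tested by this sketch.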

Similar Papers
  • Research Article
  • Cite Count Icon 29
  • 10.1007/s00239-013-9567-y
Evolution of the Genetic Code by Incorporation of Amino Acids that Improved or Changed Protein Function
  • Oct 1, 2013
  • Journal of Molecular Evolution
  • Brian R Francis

Fifty years have passed since the genetic code was deciphered, but how the genetic code came into being has not been satisfactorily addressed. It is now widely accepted that the earliest genetic code did not encode all 20 amino acids found in the universal genetic code as some amino acids have complex biosynthetic pathways and likely were not available from the environment. Therefore, the genetic code evolved as pathways for synthesis of new amino acids became available. One hypothesis proposes that early in the evolution of the genetic code four amino acids-valine, alanine, aspartic acid, and glycine-were coded by GNC codons (N=any base) with the remaining codons being nonsense codons. The other sixteen amino acids were subsequently added to the genetic code by changing nonsense codons into sense codons for these amino acids. Improvement in protein function is presumed to be the driving force behind the evolution of the code, but how improved function was achieved by adding amino acids has not been examined. Based on an analysis of amino acid function in proteins, an evolutionary mechanism for expansion of the genetic code is described in which individual coded amino acids were replaced by new amino acids that used nonsense codons differing by one base change from the sense codons previously used. The improved or altered protein function afforded by the changes in amino acid function provided the selective advantage underlying the expansion of the genetic code. Analysis of amino acid properties and functions explains why amino acids are found in their respective positions in the genetic code.

  • Research Article
  • Cite Count Icon 40
  • 10.1006/jtbi.1994.1198
The Origin and Evolution of the Genetic Code
  • Oct 1, 1994
  • Journal of Theoretical Biology
  • Pierre Béland + 1 more

  • Research Article
  • Cite Count Icon 43
  • 10.1074/jbc.m113.467977
Genetic Code-guided Protein Synthesis and Folding in Escherichia coli
  • Oct 1, 2013
  • Journal of Biological Chemistry
  • Shaoliang Hu + 3 more

Universal genetic codes are degenerate, with 61 codons specifying 20 amino acids, thus creating synonymous codons for a single amino acid. Synonymous codons have been shown to affect protein properties in a given organism. To address this issue and explore how Escherichia coli selects its "codon-preferred" DNA template(s) for synthesis of proteins with required properties, we have designed synonymous codon libraries based on an antibody (scFv) sequence and carried out bacterial expression and screening for variants with altered properties. As a result, 342 codon variants have been identified, differing significantly in protein solubility and functionality while retaining the identical original amino acid sequence. The soluble expression level varied from completely insoluble aggregates to a soluble yield of ~2.5 mg/liter, whereas the antigen-binding activity changed from no binding at all to a binding affinity of > 10⁻⁸ M. Not only does our work demonstrate the involvement of genetic codes in regulating protein synthesis and folding but it also provides a novel screening strategy for producing improved proteins without the need to substitute amino acids.

  • Research Article
  • Cite Count Icon 9
  • 10.1016/s0303-2647(99)00056-8
Evolution of the genetic code: the nonsense, antisense, and antinonsense codes make no sense
  • Dec 1, 1999
  • Biosystems
  • G Houen

  • Book Chapter
  • 10.1002/9780470015902.a0000809.pub2
Genetic Code: Introduction
  • Oct 17, 2011
  • Kimitsuna Watanabe + 1 more

The genetic code establishes the relationship between all 64 possible arrangements of triplets (codons) of the four nucleotide bases contained in either DNA (A, T, G and C) or RNA (A, U, G and C) and the 20 amino acids that are used to construct proteins via the ‘translation’ system, as well as signals of translation initiation and termination. The historical events from the 1950s to the 1960s that contributed to the deciphering of the genetic code led to the development of the field of molecular biology. In the 1960s, the genetic code was established to be ‘universal’ for all living organisms. However, from the late 1970s, variations of the genetic code have been found in various genetic systems. Variations of the genetic code promote studies on the origin and evolution of the genetic code. Key Concepts: The genetic code is the correlation between a nucleotide triplet and the corresponding amino acid. The genetic code consists of 64 triplet codons and is sometimes ‘degenerate’. The central dogma states that genetic information flows unidirectionally from DNA to protein via RNA as an intermediary. A cell‐free protein system contains ribosomes, S100 fraction, tRNA, amino acids, ATP, an energy recycling system and a template. The termination codons are UAG (amber codon), UAA (ochre codon) and UGA (opal codon), which code for no amino acid but instead cause protein synthesis to terminate. The initiation codon is AUG, which initiates protein synthesis with formyl‐methionine in bacteria and phage. The universal genetic code shows the assignment of 64 triplet codons to each of 20 amino acids and 3 termination codons, which is common to almost all extant organisms – bacteria, yeasts, viruses, plants and animals.

  • Research Article
  • Cite Count Icon 2
  • 10.5802/crbiol.47
On the origin of the genetic code: a 27-codon hypothetical precursor of an intricate 64-codon intermediate shaped the modern code.
  • Apr 21, 2021
  • Comptes rendus biologies
  • Bernard Dujon

The modern genetic code reveals numerous traces of specific relationships between the early codons which, together with its internal asymmetries, suggest a sequential appearance of the nucleobases in primitive RNA molecules. Keeping the hypothesis of triplet pairings between primitive RNA molecules at the origin of the code, this work systematically examines complete codon-anticodon interaction matrices assuming distinct pairing options at each position of the triplet duplexes. Application of these principles suggests that a 27-codon precursor having a reasonable coding capacity for short peptide synthesis could have started with primitive RNA molecules able to form two distinct pairs with different free energies between a single purine and two pyrimidines (such as G with C and U). Conservation of the same pairing options at positions 1 and 2 of codons at the arrival of a second purine with distinct pairing preferences (such as A) generated a 64-codon intermediate code made of interrelated pairs or groups of codons (designated here as intricacy). The numerous traces of this hypothetical scheme that are visible in the standard and variant forms of the modern code demonstrate without ambiguity that the ancestral codon-anticodon duplexes required high energetic pairings at their central position (Watson-Crick) but tolerated less energetic pairings at the first codon position (G • U type). Combined with the sequential appearance of the nucleobases, the predicted codon intricacy allows a stepwise reconstruction of the evolution of the coding repertoire, by simple a posteriori comparison to the modern code. This reconstruction reveals a remarkable internal coherence in terms of amino acids and tRNA synthetases recruitment. 
The code started with a group of amino acids (Ala, Gly, Pro, Ser and Thr) that are now all activated by class II tRNA synthetases before reaching an intermediate period during which up to 14 distinct amino acids could be encoded by a full set of intricated codons. The perfect coincidence between the last 6 amino acids predicted in this reconstruction and the speculated action of the arrival of free atmospheric oxygen on proteins is spectacular, and suggests that the code has only reached its present form after the great oxidation event.

  • Research Article
  • Cite Count Icon 20
  • 10.1134/s0006297921080083
A Code Within a Code: How Codons Fine-Tune Protein Folding in the Cell.
  • Aug 1, 2021
  • Biochemistry (Moscow)
  • Anton A Komar

The genetic code sets the correspondence between the sequence of a given nucleotide triplet in an mRNA molecule, called a codon, and the amino acid that is added to the growing polypeptide chain during protein synthesis. With four bases (A, G, U, and C), there are 64 possible triplet codons: 61 sense codons (encoding amino acids) and 3 nonsense codons (so-called stop codons that define termination of translation). In most organisms, there are 20 common/standard amino acids used in protein synthesis; thus, the genetic code is redundant, with most amino acids (with the exception of Met and Trp) being encoded by more than one (synonymous) codon. Synonymous codons were initially presumed to have entirely equivalent functions; however, the finding that synonymous codons are not present at equal frequencies in mRNA suggested that the specific codon choice might have functional implications beyond coding for an amino acid. Observation of nonequivalent use of codons in mRNAs implied a possibility of the existence of auxiliary information in the genetic code. Indeed, it has been found that the genetic code contains several layers of such additional information and that synonymous codons are strategically placed within mRNAs to ensure particular translation kinetics, facilitating and fine-tuning co-translational protein folding in the cell via step-wise/sequential structuring of distinct regions of the polypeptide chain emerging from the ribosome at different points in time. This review summarizes key findings in the field that have identified the role of synonymous codons and their usage in protein folding in the cell.
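The codon arithmetic in this abstract (64 triplets, 61 sense, 3 stop, with Met and Trp as the only single-codon amino acids) can be checked directly against the standard code table. The snippet below is a small illustrative sketch; the variable names are hypothetical and the table is the standard code written in its usual compact form.

```python
from collections import Counter
from itertools import product

# Standard genetic code: bases ordered U, C, A, G, third position fastest;
# "*" marks a stop (nonsense) codon.
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# Count how many codons map to each amino acid (degeneracy per residue).
codons_per_residue = Counter(CODE.values())

# Separate the stop signal from the sense codons.
n_stop = codons_per_residue.pop("*")
n_sense = sum(codons_per_residue.values())

# Amino acids encoded by exactly one codon, i.e. with no synonymous codons.
single_codon = sorted(aa for aa, n in codons_per_residue.items() if n == 1)

print(len(CODE), n_sense, n_stop, single_codon)
```

Every amino acid other than M (Met) and W (Trp) has at least two synonymous codons, which is the redundancy that codon-usage studies such as this review build on.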

  • Research Article
  • Cite Count Icon 61
  • 10.1016/j.molcel.2008.03.020
Distinct eRF3 Requirements Suggest Alternate eRF1 Conformations Mediate Peptide Release during Eukaryotic Translation Termination
  • Jun 1, 2008
  • Molecular cell
  • Hua Fan-Minogue + 8 more

  • Discussion
  • Cite Count Icon 6
  • 10.1073/pnas.0409443101
Mistakes in translation don't translate into termination.
  • Jan 26, 2005
  • Proceedings of the National Academy of Sciences of the United States of America
  • Randall A Hughes + 1 more

The evolution of the genetic code remains one of the greatest mysteries in biology. Since the elucidation of the code in the 1960s many hypotheses have been generated to try to explain the assignment of the 64 codons to the canonical 20 amino acids and punctuation. Perhaps the most famous of these, posited by Francis Crick (1), is the “frozen accident” hypothesis, in which the associations of amino acids with their three base codons evolved haphazardly, were fixed in place as organisms became more complex, and thereafter could change only with great difficulty. In this scenario, many mutations would result in amino acid substitutions that would greatly impair the functionality of proteins. Alternatively, the genetic code may have undergone a period of optimization before fixation, and amino acid substitutions would be more chemically and functionally conservative. Choosing between these (and other) scenarios is extremely difficult, though, because all of biology has evolved for many billions of years in the context of the almost universal code and thus has already been highly optimized for the extant code irrespective of whether there was preoptimization or not. At best, we can examine the nature of the chemical constraints on the current genetic code via perturbation and directed evolution experiments. Although many attempts have been made to alter the genetic code by amino acid replacement (2, 3) and codon reassignment (4, 5), few have looked at the global effects of altering the genetic code on an organism until now. To probe the degree and nature of selective pressures that constrain the genetic code, Bacher et al. (6) in this issue of PNAS explore …

  • Research Article
  • Cite Count Icon 9
  • 10.1016/j.jtbi.2018.09.021
The genetic code is not an optimal code in a model taking into account both the biosynthetic relationships between amino acids and their physicochemical properties
  • Sep 20, 2018
  • Journal of Theoretical Biology
  • Angelo Facchiano + 1 more

  • Research Article
  • Cite Count Icon 25
  • 10.1016/j.jtbi.2019.01.022
Evolution of the genetic code based on conservative changes of codons, amino acids, and aminoacyl tRNA synthetases
  • Jan 15, 2019
  • Journal of Theoretical Biology
  • Scott O Rogers

  • Research Article
  • Cite Count Icon 23
  • 10.1093/nar/gkad1160
Enzymic recognition of amino acids drove the evolution of primordial genetic codes.
  • Dec 4, 2023
  • Nucleic acids research
  • Jordan Douglas + 3 more

How genetic information gained its exquisite control over chemical processes needed to build living cells remains an enigma. Today, the aminoacyl-tRNA synthetases (AARS) execute the genetic codes in all living systems. But how did the AARS that emerged over three billion years ago as low-specificity, protozymic forms then spawn the full range of highly-specific enzymes that distinguish between 22 diverse amino acids? A phylogenetic reconstruction of extant AARS genes, enhanced by analysing modular acquisitions, reveals six AARS with distinct bacterial, archaeal, eukaryotic, or organellar clades, resulting in a total of 36 families of AARS catalytic domains. Small structural modules that differentiate one AARS family from another played pivotal roles in discriminating between amino acid side chains, thereby expanding the genetic code and refining its precision. The resulting model shows a tendency for less elaborate enzymes, with simpler catalytic domains, to activate amino acids that were not synthesised until later in the evolution of the code. The most probable evolutionary route for an emergent amino acid type to establish a place in the code was by recruiting older, less specific AARS, rather than adapting contemporary lineages. This process, retrofunctionalisation, differs from previously described mechanisms through which amino acids would enter the code.

  • Discussion
  • Cite Count Icon 1
  • 10.1016/j.cub.2004.01.041
Genetic code
  • Feb 1, 2004
  • Current Biology
  • Andre R.O Cavalcanti + 1 more

  • Research Article
  • Cite Count Icon 26
  • 10.1002/bies.201600213
Reassigning stop codons via translation termination: How a few eukaryotes broke the dogma.
  • Dec 23, 2016
  • BioEssays
  • Elena Alkalaeva + 1 more

The genetic code determines how amino acids are encoded within mRNA. It is universal among the vast majority of organisms, although several exceptions are known. Variant genetic codes are found in ciliates, mitochondria, and numerous other organisms. All revealed genetic codes (standard and variant) have at least one codon encoding a translation stop signal. However, recently two new genetic codes with a reassignment of all three stop codons were revealed in studies examining the protozoa transcriptomes. Here, we discuss this finding and the recent studies of variant genetic codes in eukaryotes. We consider the possible molecular mechanisms allowing the use of certain codons as sense and stop signals simultaneously. The results obtained by studying these amazing organisms represent a new and exciting insight into the mechanism of stop codon decoding in eukaryotes.

  • Book Chapter
  • Cite Count Icon 5
  • 10.1007/978-3-319-91092-5_6
P-Adic Side of the Genetic Code and the Genome
  • Jan 1, 2018
  • Branko Dragovich + 1 more

The genetic code is a mapping from the set of 64 codons onto the set of 20 amino acids and one stop signal. The codons are ordered triplets composed of the nucleotides cytosine (C), adenine (A), uracil (U) (or thymine (T)), guanine (G) and they are contained in the genes. The amino acids are building blocks of the proteins. The vertebrate mitochondrial code is rather simple and the other genetic codes can be considered as its slight modifications. In the vertebrate mitochondrial code, an amino acid is coded by one, two or three codon doublets. When two codons code the same amino acid, one can say that they are close in the informational sense. We show that the p-adic distance is an adequate mathematical instrument for description of the informational codon closeness (nearness, similarity). We show that the set of codons and the set of amino acids are p-adic ultrametric spaces and that the vertebrate mitochondrial code is an ultrametric network. A p-adic approach to possible evolution of the genetic code is presented. We also demonstrate that p-adic closeness between codons, and between nucleotides, is also useful in the investigation of informational closeness between sequences in the genome.
