Abstract

The development of ribosomes that can read quadruplet codons could trigger a giant leap in the complexity of protein sequences. Although the practical exploration of sequence space is still limited to an infinitesimal fraction of the total volume, a full quadruplet genetic code would essentially double the information-theoretic content of proteins. Analogous studies modifying the alphabet size of ribozymes suggest that increasing the information-theoretic content of the genetic code could permit a corresponding increase in functionality. Recent work has overcome major inefficiencies in the translation of programmable quadruplet codons, paving the way for studies on fundamental questions about the origin of the genetic code and the characteristics of alternate protein “universes”.

Any protein can be thought of as occupying a specific point in sequence space, the multidimensional, discrete space in which each axis corresponds to the amino acid at a particular site in the protein [1]. The size of sequence space is limited by the number of letters in the alphabet and the length of the protein. The standard genetic code encodes 20 types of amino acid, so the number of theoretically possible proteins is 20^L (where L is the number of amino acid residues in the protein sequence), although selenocysteine and pyrrolysine can increase this alphabet to 22 letters. A substantial body of work has further expanded the alphabet to include a large range of non-biological functional groups, incorporating around 70 different unnatural amino acids [2]. As with selenocysteine and pyrrolysine, artificial expansion of the genetic code usually involves translating an amber stop codon using a tRNA aminoacylated with the unnatural amino acid. However, a larger number of “blank” codons would be required to incorporate multiple different unnatural amino acids into the same protein. Therefore, quadruplet (i.e., frameshift) codons and the corresponding aminoacylated tRNAs have also been explored [3–5]. In principle, a quadruplet genetic code could permit a huge increase in the volume of accessible sequence space. However, prior efforts using quadruplet or amber codons were ultimately limited by competition with native tRNAs, which are readily accepted by the native ribosome, and quadruplet codons were particularly hampered by low efficiency. Furthermore, uncontrolled incorporation of the unnatural amino acid caused widespread changes to the proteome in vivo.

To circumvent these problems, Rackham and Chin designed “orthogonal” ribosomes that recognize an altered ribosome-binding site (RBS), so that only mRNAs containing the mutant RBS are translated by the orthogonal ribosomes [6]. These ribosomes were still inefficient at incorporating unnatural amino acids, but Wang et al. [7] evolved them to reduce premature termination, yielding ribo-X. This procedure produced orthogonal ribosomes with increased amber suppression on the desired mRNA, while native ribosomes maintained the regular level of amber suppression. In recent work, Neumann et al. [8] evolved ribo-X further to enhance its efficiency in translating quadruplet codons. These ribosomes, termed ribo-Q, decoded quadruplet codons with efficiency and fidelity similar to those for triplet codons. A protein containing an azide and an alkyne was produced efficiently using a quadruplet codon and amber suppression on the orthogonal mRNA, allowing formation of an internal cross-link. In principle, ribo-Q (and perhaps its descendants) might enable even more ambitious alterations to proteins.
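
The sequence-space counts discussed above can be made concrete with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and it assumes, as an upper bound not stated in the text, that every one of the 4^4 = 256 quadruplet codons could be assigned to a distinct amino acid (ignoring stop codons). Under that assumption the information per residue rises from log2(20), about 4.3 bits, to at most 8 bits, which is one way to read the statement that a full quadruplet code would roughly double the information content of proteins.

```python
import math


def sequence_space_size(alphabet_size: int, length: int) -> int:
    """Number of distinct sequences of a given length: alphabet_size ** length."""
    return alphabet_size ** length


def codon_count(codon_length: int, bases: int = 4) -> int:
    """Number of distinct codons of a given length over a 4-base alphabet."""
    return bases ** codon_length


def bits_per_residue(alphabet_size: int) -> float:
    """Maximum information per residue, in bits, for a given alphabet size."""
    return math.log2(alphabet_size)


# Standard code: 20 amino acids encoded by 64 triplet codons.
triplets = codon_count(3)      # 64
quadruplets = codon_count(4)   # 256

# Assumption (upper bound): each quadruplet codon encodes a distinct amino acid.
standard_bits = bits_per_residue(20)
quadruplet_bits = bits_per_residue(quadruplets)

print("triplet codons:", triplets)
print("quadruplet codons:", quadruplets)
print("bits per residue, 20-letter alphabet:", round(standard_bits, 2))
print("bits per residue, 256-letter upper bound:", round(quadruplet_bits, 2))
print("ratio:", round(quadruplet_bits / standard_bits, 2))
print("possible tripeptides, 20-letter alphabet:", sequence_space_size(20, 3))
```

Because Python integers have arbitrary precision, `sequence_space_size(20, 100)` can also be evaluated exactly for a 100-residue protein, which gives a sense of why only an infinitesimal fraction of sequence space can be explored in practice.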
