Abstract

In the book of Genesis, God gave Adam the power to name all the creatures that he encountered. From the moment that he named the first apple “delicious,” he ran into trouble. It may have looked delicious, but until he actually did the experiment, he had no idea of its real properties. More recently, bioinformatics programs have given us the power to name all the genes whose sequences we encounter. A paper by Martinez-Gomez et al. in this issue (2) shows that inappropriate nomenclature can still get us into trouble. Until we do the experiment, we may have no idea of the real properties of a gene product. Specifically, they show that only the rhodanese domain of the ThiI protein is required for a key thiolation reaction in the synthesis of thiamine. The other two domains (THUMP and AANH) are dispensable. And yet, nearly three-quarters of all genes annotated as thiI by the OMNIOME.pep resource encode only the first two domains and lack the rhodanese domain entirely. Worse yet, Pfam, another of the more common resources for information about protein domains, calls the AANH domain a “ThiI domain” and fails to identify a rhodanese domain in either the Escherichia coli or Salmonella enterica ThiI protein. Thus, proteins with an AANH domain but no rhodanese domain are likely to be annotated as ThiI and those with a rhodanese domain but no AANH domain are likely to be annotated as “unknown function,” exactly the reverse of reality. To understand where this confusion begins, we need to recall that the thiI gene is allelic to the nuvA gene of E. coli. It has been shown that ThiI/NuvA is responsible for thiolating the uridine residue in tRNA. Thiolation of tRNA requires all three domains of ThiI, especially the second domain, which is involved in activating the target (by adenylylation) and transferring the sulfur from the rhodanese domain to the target.
The thiolation reaction in thiamine biosynthesis also requires activation by adenylylation and transfer of a disulfide bond. Thus, it was tempting to predict (incorrectly) that the chemistry of the thiolation in thiamine synthesis would be completely analogous. The bioinformatic method of “prediction by analogy” offers great utility and convenience but tenuous logic for naming genes and proteins. Thus, if the thiI (nuvA) gene from E. coli contains a domain for binding, a domain for adenylylating, and a domain for transferring sulfur, then the biochemical function of ThiI (NuvA) protein in S. enterica or any organism with a similar gene can be predicted (by analogy): it will bind, adenylylate, and transfer sulfur. Moreover (and even more dangerously), the physiological role of such similar proteins can also be predicted (by analogy): they will be required for thiamine synthesis. Each of these predictions involves a probabilistic argument, and the compounding of multiple probabilities degrades the confidence level of such “predictions by analogy.” Several facts about ThiI are well established: thiI mutants are auxotrophic for thiamine, specifically for the thiazole component of this essential vitamin (7). thiI is allelic to a previously characterized E. coli gene, nuvA (3). ThiI/NuvA is the enzyme responsible for the thiolation of a uridine residue in tRNA. Three domains of ThiI are essential for the thiolation of tRNA: a THUMP domain that binds tRNA, an AANH domain that activates the uridine residue by adenylylation (9), and a rhodanese domain that transfers sulfur to the activated
