Abstract

The genome of the soil bacterium Anaeromyxobacter dehalogenans (Adhal) is reported to have 4346 coding sequences comprising 1,521,374 codons, a GC content of 74.82%, and GC occupancy of the third base position of 97.07%. Over 50% of the proteins in Adhal are annotated as having uncertain or unknown function. Over 1000 have two or fewer Met residues, and Val and Leu codons are often identified as start codes. Examination of the sequences indicates that over 80% of them have three open reading frames. The presence of multiple open reading frames and the ambiguity of start and stop code identification have resulted in the misidentification of hundreds of nonsense sequences as hypothetical proteins. Significant differences in codon use between functionally characterized and hypothetical proteins, together with errors in coding frame selection and in start and stop code identification, indicate that most (and possibly all) of the sixteen codons that have A or T bases in the first and third positions are nonsense codes (de facto stops). These patterns, which are present in many bacterial genomes, suggest that the modern genetic code evolved from a coding system in which just the 32 codons that end in G or C define all 20 amino acids. The DNA in which this code evolved was stabilized by the three hydrogen bonds linking the GC pairs present at every third base pair of the protein-coding DNA.
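The codon counts quoted in the abstract can be checked against the standard genetic code. The following Python sketch is illustrative only and is not the authors' analysis: it assumes the standard code (NCBI translation table 1), and the names `CODON_TABLE`, `gc_ending`, `at_first_third`, and `gc_stats` are hypothetical. It enumerates the 32 codons that end in G or C and the sixteen codons with A or T in the first and third positions, and computes overall GC content and third-position GC occupancy for a coding sequence.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Codons whose third base is G or C.
gc_ending = [c for c in CODON_TABLE if c[2] in "GC"]

# Codons with A or T in both the first and the third position.
at_first_third = [c for c in CODON_TABLE if c[0] in "AT" and c[2] in "AT"]

# Amino acids reachable from G/C-ending codons ('*' marks a stop codon).
amino_acids_from_gc_ending = {CODON_TABLE[c] for c in gc_ending} - {"*"}


def gc_stats(cds):
    """Return (overall GC fraction, third-position GC fraction) for an
    in-frame coding sequence; trailing bases that do not fill a codon
    are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    if not codons:
        raise ValueError("sequence is shorter than one codon")
    gc_all = sum(base in "GC" for codon in codons for base in codon)
    gc_third = sum(codon[2] in "GC" for codon in codons)
    return gc_all / (3 * len(codons)), gc_third / len(codons)


if __name__ == "__main__":
    print(len(gc_ending), "codons end in G or C")
    print(len(at_first_third), "codons have A/T in positions 1 and 3")
    print(len(amino_acids_from_gc_ending), "amino acids are covered")
    print(gc_stats("ATGGCCGTGCTGTAG"))
```

Under the standard code, the G/C-ending set includes the stop codon TAG, so 31 of its 32 codons are sense codons, and the A/T set includes the stop codons TAA and TGA. Whether the remaining fourteen behave as de facto stops in Adhal is the abstract's claim and is not something this sketch tests.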
