Abstract

Background: Prediction of structure and function for uncharacterized protein families by identification of evolutionary links to characterized families and known structures is one of the cornerstones of genomics. Theoretical assignment of three-dimensional folds and prediction of protein function, even at a very general level, can facilitate the experimental determination of the molecular mechanism of action and the role that members of a given protein family fulfill in the cell. Here, we predict the three-dimensional fold and study the phylogenomic distribution of members of a large family of uncharacterized proteins classified in the Clusters of Orthologous Groups database as COG4636.

Results: Using protein fold-recognition, we found that members of COG4636 are remotely related to Holliday junction resolvases and other nucleases from the PD-(D/E)XK superfamily. Structure modeling and sequence analyses suggest that most members of COG4636 exhibit a new, unusual variant of the putative active site, in which the catalytic Lys residue has migrated in the sequence but retained a similar spatial position with respect to other functionally important residues. Sequence analyses revealed that members of COG4636 and their homologs are found mainly in Cyanobacteria, but also in other bacterial phyla. They undergo horizontal transfer and extensive proliferation in the colonized genomes; for instance, in Gloeobacter violaceus PCC 7421 they comprise over 2% of all protein-encoding genes. Thus, members of COG4636 appear to be a new type of selfish genetic element, which may fulfill an important role in the genome dynamics of Cyanobacteria and other species they have invaded. Our analyses provide a platform for experimental determination of the molecular and cellular function of members of this large protein family.

Conclusion: After submission of this manuscript, a crystal structure of one of the COG4636 members was released in the Protein Data Bank (code 1wdj; Idaka, M., Wada, T., Murayama, K., Terada, T., Kuramitsu, S., Shirouzu, M., Yokoyama, S.: Crystal structure of Tt1808 from Thermus thermophilus Hb8, to be published). Our analysis of the Tt1808 structure reveals that we correctly predicted all functionally important features of the COG4636 family, including the membership in the PD-(D/E)XK superfamily of nucleases, the three-dimensional fold, the putative catalytic residues, and the unusual configuration of the active site.

Highlights

  • Prediction of structure and function for uncharacterized protein families by identification of evolutionary links to characterized families and known structures is one of the cornerstones of genomics

  • Sequence analysis of COG4636 reveals remote similarity to PD-(D/E)XK nucleases. In the course of analyses of proteins with unknown structures, we came across a family of sequences grouped together in the Clusters of Orthologous Groups (COG) database [30] as COG4636 and annotated as "uncharacterized protein conserved in Cyanobacteria"

  • Preliminary analysis of sequence conservation combined with secondary structure prediction revealed a characteristic pattern of α-helices and β-strands associated with conserved carboxylate residues, which suggested that members of COG4636 may belong to the PD-(D/E)XK superfamily (Figure 1)

Summary

Introduction

Prediction of structure and function for uncharacterized protein families by identification of evolutionary links to characterized families and known structures is one of the cornerstones of genomics. All members of the PD-(D/E)XK superfamily share a common structural core, comprising a mixed β-sheet of 4 or 5 strands flanked on both sides by α-helices [1,2,12]. These secondary structures are often embedded in very different peripheral elements, which sometimes constitute the majority of the protein. In some REases, the acidic residue from the (D/E)XK half-motif was found to have "migrated" to another region of the polypeptide, such that the position of the carboxylate group in the active site is generally maintained as in the "orthodox" members of the PD-(D/E)XK superfamily, even though the side chain is attached to a different place in the backbone [16,17,18,19]. The conserved Lys has also been found to be replaced by a Glu, Gln, or Asn residue [20,21,22].
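To make the "orthodox" motif concrete, the sketch below scans a protein sequence for an Asp followed, after a variable spacer, by a (D/E)-X-K triplet. It is an illustration only, not the method used in this work: the regular expression, the 5 to 30 residue spacer, and the function name `find_candidate_motifs` are all assumptions chosen for the example. A purely sequence-based pattern like this would also miss the variants described above, in which a catalytic residue has migrated or the Lys has been replaced, which is one reason fold-recognition is needed for such families.

```python
import re

# Hypothetical, simplified pattern for an "orthodox" PD-(D/E)XK arrangement:
# an Asp, a variable spacer, then (Asp or Glu)-any residue-Lys.
# The 5-30 residue spacer is an assumption made for illustration only.
# The lookahead lets candidates that start at different Asp residues overlap.
MOTIF = re.compile(r"(?=(D.{5,30}?[DE].K))")


def find_candidate_motifs(sequence):
    """Return (0-based start, matched segment) for each candidate motif."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in MOTIF.finditer(seq)]


if __name__ == "__main__":
    # Toy sequences, not real proteins.
    print(find_candidate_motifs("AAPDAAAAAAEVKAA"))
    # Carboxylate replaced by Gln: the simple pattern finds nothing.
    print(find_candidate_motifs("AAPDAAAAAAQVKAA"))
```

Any hit from such a scan is only a candidate; confirming membership in the superfamily requires checking that the residues fall on the expected secondary structure elements of the common core.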

