Abstract

The combination of genomic sequencing with structural genomics has provided a wealth of new structures for previously uncharacterized ORFs, more commonly referred to as hypothetical proteins. This rapid growth is the direct result of high-throughput, automated approaches to both the identification of new ORFs and the determination of high-resolution 3-D protein structures. Functional annotation, however, remains a significant bottleneck because the assignment of function is not readily automatable. Often the initial structural analysis at best indicates a functional family for a given hypothetical protein, and further identification of a relevant ligand or substrate is impeded by the diversity of function within a particular Structural Classification of Proteins (SCOP) family, by a highly selective and specific ligand-binding site, or by the identification of a novel protein fold. Our approach to the functional annotation of hypothetical proteins combines structural information with additional bioinformatics evidence garnered from operon prediction, loose functional information about other operon members, and conservation of catalytic residues, together with cocrystallization trials and virtual ligand screening. The synthesis of all available information for each protein has permitted the functional annotation of several hypothetical proteins from Escherichia coli, and each assignment has been confirmed through generally accepted biochemical methods.
