Advances in sequencing technologies have led to rapid growth of public protein sequence databases, while the fraction of proteins with experimentally verified function continuously decreases. This problem is currently addressed by automated functional annotation with computational tools, which, however, lack the accuracy of experimental approaches and are susceptible to error propagation. Here, we present an approach that combines the efficiency of functional annotation by in silico methods with the rigor of enzyme characterization in vitro. In a first step, a representative enzyme of a group of homologues is thoroughly analyzed experimentally, including a focused alanine scan of the active site to determine a fingerprint of function-determining residues. In a second step, this fingerprint is used in combination with a sequence similarity network to identify putative isofunctional enzymes among the homologues. Using this approach in a proof-of-principle study, homologues of the histidinol phosphate phosphatase (HolPase) from Pseudomonas aeruginosa, many of which were annotated as phosphoserine phosphatases, were predicted to be HolPases. This functional annotation of the homologues was verified by in vitro testing of several representatives and by an analysis of the occurrence of annotated HolPases in the corresponding phylogenetic groups. Moreover, applying the same approach to the homologues of the HolPase from the archaeon Nitrosopumilus maritimus, which is not related to the HolPase from P. aeruginosa and was newly discovered in the course of this work, led to the annotation of putative HolPases from various archaeal species.
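The fingerprint-matching step described above can be illustrated with a minimal sketch. It assumes that the homologues have already been aligned to the characterized representative and that the fingerprint is expressed as alignment columns with the residues allowed at each. The function names, the toy sequences, and the fingerprint positions are all hypothetical and are not taken from the study. The sequence similarity network step is not modelled here.

```python
from typing import Dict, List, Set


def matches_fingerprint(aligned_seq: str, fingerprint: Dict[int, Set[str]]) -> bool:
    """Return True if every fingerprint column holds one of its allowed residues.

    `fingerprint` maps a 0-based alignment column to the set of residues
    accepted at that column (hypothetical representation).
    """
    return all(
        col < len(aligned_seq) and aligned_seq[col] in allowed
        for col, allowed in fingerprint.items()
    )


def predict_isofunctional(
    alignment: Dict[str, str], fingerprint: Dict[int, Set[str]]
) -> List[str]:
    """List the names of aligned sequences that carry the complete fingerprint."""
    return [
        name for name, seq in alignment.items()
        if matches_fingerprint(seq, fingerprint)
    ]


# Toy data for illustration only: these are NOT real HolPase sequences.
toy_alignment = {
    "reference":   "MDKHE-RS",
    "homologue_a": "MDRHEARS",
    "homologue_b": "MAKHE-RT",
    "homologue_c": "-DKHEGRS",
}

# Hypothetical fingerprint: columns 1, 3 and 7 must hold D, H and S.
toy_fingerprint = {1: {"D"}, 3: {"H"}, 7: {"S"}}

if __name__ == "__main__":
    print(predict_isofunctional(toy_alignment, toy_fingerprint))
```

In this toy example, homologue_b is rejected because it differs from the fingerprint at two of the three positions, whereas differences outside the fingerprint columns (as in homologue_a and homologue_c) do not affect the prediction. In practice, such a filter would be applied to the members of a sequence similarity network cluster rather than to a hand-written dictionary.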