The protein databases contain many proteins with unknown function. A computational approach for predicting ligand specificity that requires only the sequence of the unknown protein would be valuable for directing experiment-based assignment of function. We focused on a family of unknown proteins in the mechanistically diverse enolase superfamily and used two approaches to assign function: (i) enzymatic assays using libraries of potential substrates, and (ii) in silico docking of the same libraries using a homology model based on the most similar (35% sequence identity) characterized protein. The results of the two approaches matched closely, and an experimentally determined structure confirmed the predicted structure of the substrate-liganded complex. We assigned the N-succinyl arginine/lysine racemase function to the family, correcting the previous annotation (L-Ala-D/L-Glu epimerase), which had been based on the function of the most similar characterized homolog. These studies establish that ligand docking to a homology model can facilitate functional assignment of unknown proteins by restricting the identities of the possible substrates that must be experimentally tested.