Abstract

Evolution has provided us with many protein sequences. However, these sequences represent a very small fraction of the possible sequences. In the laboratory, scientists have explored areas of sequence space not represented by natural proteins, both to better understand natural proteins and to create new proteins with desirable properties. The principal mechanism used to explore protein sequence space is mutagenesis. However, recombination of homologous genes can also explore regions of sequence space rich in folded and functional proteins. In this work, we demonstrated, using a beta-lactamase model system, that a computational energy function (SCHEMA) can predict which of the chimeras made by recombining distantly related proteins are likely to fold. SCHEMA uses protein sequence and structure information to identify pairwise amino acid interactions disrupted by recombination. Using SCHEMA, we designed libraries of chimeric beta-lactamases. These libraries were intended to have a high fraction of folded variants while incorporating many amino acid substitutions relative to the parental proteins. The chimeras in these libraries were characterized to determine whether they retained the parental function and what new substrate specificities could be obtained. To identify the variables critical for determining whether a chimera functions, we applied logistic regression analysis to functional and nonfunctional chimeras. This analysis shows that both two-body (pairwise) and one-body terms play a significant role in determining whether a chimera functions. We also used random mutagenesis to restore functionality to nonfunctional chimeras, showing that a thermostabilizing mutation can rescue approximately 5% of the nonfunctional chimeras. The one-body terms that appear significant for determining whether a chimera functions are not explicitly counted by SCHEMA when predicting chimera folding.
To estimate the effects on chimera folding represented by the one-body terms, we developed an additional measure that predicts chimera folding from just the chimera amino acid sequence and a multiple sequence alignment of homologous proteins. This measure is predictive of chimera folding on its own and, when combined with the pairwise SCHEMA energy, gives more accurate folding predictions than SCHEMA alone.
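
The pairwise disruption count described above can be illustrated with a minimal sketch. It assumes the commonly described form of the SCHEMA energy, in which a structural contact counts as disrupted when the residue pair the chimera places at those two positions does not occur at the same positions in any parent. The function names, the toy sequences, the block layout, and the contact list are all hypothetical and chosen only for illustration; they are not taken from this work.

```python
from typing import Sequence, Tuple


def build_chimera(parents: Sequence[str],
                  block_of_position: Sequence[int],
                  parent_of_block: Sequence[int]) -> str:
    """Assemble a chimera from aligned parents: each position takes the
    residue of the parent assigned to that position's block."""
    return "".join(
        parents[parent_of_block[block]][pos]
        for pos, block in enumerate(block_of_position)
    )


def schema_disruption(chimera: str,
                      parents: Sequence[str],
                      contacts: Sequence[Tuple[int, int]]) -> int:
    """Count contacts (i, j) whose residue pair in the chimera is absent
    at positions (i, j) in every parent (assumed SCHEMA-style definition)."""
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if all((p[i], p[j]) != pair for p in parents):
            broken += 1
    return broken


# Toy data (hypothetical): two aligned parents of length 6,
# two blocks of three positions each, and three structural contacts.
parents = ["ACDEFG", "AKDQFW"]
block_of_position = [0, 0, 0, 1, 1, 1]
contacts = [(0, 2), (1, 3), (3, 5)]

# Block 0 from parent 0, block 1 from parent 1.
chimera = build_chimera(parents, block_of_position, [0, 1])
energy = schema_disruption(chimera, parents, contacts)
print(chimera, energy)
```

In this toy case only the contact between positions 1 and 3 spans the two blocks with a residue pair that neither parent contains, so a single contact is counted as disrupted. Contacts inside one block, or between positions where the parents are identical, never contribute, which is why the count depends on where the block boundaries fall.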

