Abstract
Protein families comprise numerous sequences that share a similar three-dimensional structure and similar functional properties. The cytochrome P450 superfamily is an excellent example of a common structural scaffold being used for a variety of biological functions. In Nature, this functional diversity arises through millions of years of evolution that create new and diverse sequences. We have used site-directed recombination guided by the computational algorithm SCHEMA to create an artificial family of cytochromes P450 in the laboratory. Members of this family possess unique properties such as altered activity profiles, increased thermostability and the ability to accept new substrates. We developed screening tools for the rapid analysis of hundreds of individual P450s. These high-throughput assays include the 4-aminoantipyrine (4-AAP) assay, which detects the hydroxylation of an aromatic ring. High-throughput carbon monoxide binding allows rapid detection of P450s that correctly incorporate a heme cofactor and are thus properly folded and potentially functional. Finally, we describe a substrate-binding assay that measures the spectral shift that occurs when a substrate binds in a P450 active site. Fourteen double-crossover chimeras of the bacterial P450s CYP102A1 and CYP102A2 were constructed to calibrate the P450 scaffold for SCHEMA, a computational algorithm used to minimize structural disruption in chimeric proteins. We found that only chimeras with high levels of structural disruption, as measured by SCHEMA, were unfolded. Among the fourteen chimeras we also observed three different activity profiles in peroxygenase kinetic assays with the substrates p-nitrophenoxydodecanoic acid (12-pNCA), 2-phenoxyethanol and allyloxybenzene.
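To make the idea of a structural-disruption score concrete, the following is a minimal sketch rather than the SCHEMA implementation used in this work. It assumes the contact-counting formulation commonly associated with SCHEMA: a residue-residue contact counts as broken when the amino-acid pair found at those two positions in the chimera occurs in none of the parents. The function names, the toy sequences and the contact list are all hypothetical; in practice the contacts would be derived from a parent crystal structure.

```python
def build_chimera(parents, crossovers, block_parents):
    """Assemble a chimera from aligned, equal-length parent sequences.

    crossovers    -- sorted indices at which a new block starts
    block_parents -- for each block, the index of the parent supplying it
    """
    bounds = [0, *crossovers, len(parents[0])]
    return "".join(
        parents[p][start:end]
        for p, start, end in zip(block_parents, bounds, bounds[1:])
    )


def schema_disruption(chimera, parents, contacts):
    """Count contacts whose residue pair occurs in no parent.

    This is the assumed definition of the disruption score E, not a
    definition taken from the text above.
    """
    broken = 0
    for i, j in contacts:
        pair = (chimera[i], chimera[j])
        if all((p[i], p[j]) != pair for p in parents):
            broken += 1
    return broken


# Hypothetical toy data: two aligned 10-residue "parents" and four contacts.
toy_parents = ["ACDEFGHIKL", "AVDQFGYIRL"]
toy_contacts = [(1, 3), (3, 6), (6, 8), (0, 9)]

# Double crossover: parent 0, then parent 1 from index 2, then parent 0 from index 7.
toy_chimera = build_chimera(toy_parents, crossovers=[2, 7], block_parents=[0, 1, 0])
print(toy_chimera, schema_disruption(toy_chimera, toy_parents, toy_contacts))
```

In this toy case two of the four contacts pair residues that never occur together in either parent, so the chimera scores 2, while either parent scores 0 by construction.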
We applied this calibration to create an artificial family comprising ~3,000 chimeric P450 heme proteins that correctly fold and incorporate a heme cofactor, by recombining three cytochromes P450 at seven crossover locations chosen to minimize structural disruption. Members of this protein family differ from any known sequence by an average of 72 and by as many as 109 amino acids. Most (>73%) of the properly folded chimeric P450 heme proteins are catalytically active peroxygenases; some are more thermostable than the parent proteins. A multiple sequence alignment of 955 chimeras, both folded and unfolded, was analyzed using logistic regression analysis (LRA) to identify key structural contributions to cytochrome P450 heme incorporation and peroxygenase activity. The analysis also suggests possible structural differences between parents CYP102A1 and CYP102A2. Thirty-four members of this artificial family were assayed for functional diversity on a set of eight substrates. P450 chimeras exceeded the parents in total activity on all eight substrates and were clustered into five groups based on their activity profiles using K-means clustering. Activity profiles on the eight substrates were then measured in high throughput to produce a data set of 330 chimeras. The mean percent standard deviation of the activity assays showed that these high-throughput data are reproducible, and further analysis may reveal information about sequence-structure-function relationships. The products of the catalytic reactions of four chimeric P450s with substrates of human P450s, some of which are drug compounds, were analyzed by HPLC to identify them. Chimeras produced authentic human metabolites of chlorzoxazone, zoxazolamine and propranolol, showed peroxidase activity on 4-aminobiphenyl and produced an unknown product with tolbutamide.
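The combinatorics of this design can be illustrated with a short sketch. Seven crossover locations divide each sequence into eight blocks, and with three parents there are 3^8 = 6,561 possible block combinations, of which the ~3,000 folded chimeras reported above are a subset. The code below enumerates such a library for hypothetical toy parents (16 residues each, differing at every position) and counts how many positions separate each chimera from its closest parent. The function names, sequences and crossover positions are assumptions made for illustration only; the real parents share many residues, so real distances depend on the actual alignment.

```python
from itertools import product


def enumerate_chimeras(parents, crossovers):
    """Yield (block choice, sequence) for every combination of parent blocks."""
    bounds = [0, *crossovers, len(parents[0])]
    spans = list(zip(bounds, bounds[1:]))
    for choice in product(range(len(parents)), repeat=len(spans)):
        yield choice, "".join(parents[p][s:e] for p, (s, e) in zip(choice, spans))


def distance_to_closest_parent(seq, parents):
    """Number of positions at which seq differs from its most similar parent."""
    return min(sum(a != b for a, b in zip(seq, p)) for p in parents)


# Hypothetical toy parents: three 16-residue sequences, seven crossovers.
demo_parents = ["A" * 16, "C" * 16, "D" * 16]
demo_crossovers = list(range(2, 16, 2))  # 2, 4, ..., 14 -> eight blocks of two

library = list(enumerate_chimeras(demo_parents, demo_crossovers))
distances = [distance_to_closest_parent(seq, demo_parents) for _, seq in library]

print(len(library))  # 3**8 = 6561 block combinations
print(sum(distances) / len(distances), max(distances))
```

Exhaustive enumeration is feasible here because the library holds only a few thousand sequences; the same loop could attach a disruption score or an assay result to each block choice.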
Finally, the peroxygenase activity of a mutant P450 heme domain can be further altered and enhanced by directed evolution. After two rounds of directed evolution and screening with the 4-AAP assay, we found mutants with altered substrate specificities and an overall enhancement of activity. The design and high-throughput methodologies described here can be used to create artificial protein families and to discover new and useful protein sequences. Like natural protein families, artificial protein families can be used to identify regions of protein sequence and structure that are important for folding and function. This is especially useful for analyzing protein families with few members, for validating structure-prediction tools and for protein sequence-structure-function analysis. Artificial protein families are also rich in sequence diversity and can provide sources of novel protein function. Using the high-throughput methodologies described here, we discovered chimeric P450s in our artificial family with enhanced activity, altered activity profiles, and the ability to hydroxylate drug-like compounds to produce authentic human metabolites. We hope these methodologies will be extended to the study of other protein families and to the creation and discovery of increasingly valuable protein catalysts.