Abstract

BackgroundIn the search for generic expression strategies for mammalian protein families several bacterial expression vectors were examined for their ability to promote high yields of soluble protein. Proteins studied included cell surface receptors (Ephrins and Eph receptors, CD44), kinases (EGFR-cytoplasmic domain, CDK2 and 4), proteases (MMP1, CASP2), signal transduction proteins (GRB2, RAF1, HRAS) and transcription factors (GATA2, Fli1, Trp53, Mdm2, JUN, FOS, MAD, MAX). Over 400 experiments were performed where expression of 30 full-length proteins and protein domains were evaluated with 6 different N-terminal and 8 C-terminal fusion partners. Expression of an additional set of 95 mammalian proteins was also performed to test the conclusions of this study.ResultsSeveral protein features correlated with soluble protein expression yield including molecular weight and the number of contiguous hydrophobic residues and low complexity regions. There was no relationship between successful expression and protein pI, grand average of hydropathicity (GRAVY), or sub-cellular location. Only small globular cytoplasmic proteins with an average molecular weight of 23 kDa did not require a solubility enhancing tag for high level soluble expression. Thioredoxin (Trx) and maltose binding protein (MBP) were the best N-terminal protein fusions to promote soluble expression, but MBP was most effective as a C-terminal fusion. 63 of 95 mammalian proteins expressed at soluble levels of greater than 1 mg/l as N-terminal H10-MBP fusions and those that failed possessed, on average, a higher molecular weight and greater number of contiguous hydrophobic amino acids and low complexity regions.ConclusionsBy analysis of the protein features identified here, this study will help predict which mammalian proteins and domains can be successfully expressed in E. coli as soluble product and also which are best targeted for a eukaryotic expression system. In some cases proteins may be truncated to minimise molecular weight and the numbers of contiguous hydrophobic amino acids and low complexity regions to aid soluble expression in E. coli.

Highlights

  • In the search for generic expression strategies for mammalian protein families several bacterial expression vectors were examined for their ability to promote high yields of soluble protein

  • We found that successful soluble expression of the 95 protein set correlated with lower molecular weight, number of low complexity regions and contiguous hydrophobic amino acids compared with proteins that failed to express solubly with the correct size, validating our earlier conclusions, there were some exceptions

  • What guidelines have emerged from this study in developing a strategy for the production of soluble mammalian proteins in E. coli? If the protein has a molecular weight of less than 30 KDa and contains 1 or less low complexity regions and less than 4 contiguous hydrophobic amino acids expression of the full-length protein in E. coli should give good levels of soluble protein

Read more

Summary

Introduction

In the search for generic expression strategies for mammalian protein families several bacterial expression vectors were examined for their ability to promote high yields of soluble protein. The availability of proteins is important for biomedical applications such as small molecule drug discovery and the production of therapeutic proteins and vaccines In these situations it is essential to be able to reliably express the proteins in a heterologous system and purify them so that they possess the same folds and structure as they would in a natural in vivo state. To achieve this on a whole proteome scale a generic approach must be taken to the expression of protein families, unlike the traditional approach of protein chemistry in optimising the isolation of individual proteins on a case by case basis. Purification of inclusion bodies and in vitro refolding has been used in a number of cases, but refolding conditions are highly protein specific and so unlikely to be useful for high-throughput protein expression

Objectives
Methods
Results
Discussion
Conclusion
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call