Abstract

The majority of protein sequence data published today is of metagenomic origin. However, our ability to assign functions to these sequences is often hampered by our general inability to cultivate the larger part of microbial species and by the sheer amount of sequence data generated in these projects. Here we present a combination of bioinformatics, synthetic biology, and Escherichia coli genetics to discover biocatalysts in metagenomic datasets. We created a subset of the Global Ocean Sampling dataset, the largest metagenomic project published to date, by removing all proteins that matched Hidden Markov Models of known protein families from PFAM and TIGRFAM with high confidence (E-value < 10⁻⁵). This essentially left us with proteins with low or no homology to known protein families, still encompassing ~1.7 million different sequences. In this subset, we then identified protein families de novo with a Markov clustering algorithm. For each protein family, we defined a single representative based on its phylogenetic relationship to all other members in that family. This reduced the dataset to ~17,000 representatives of protein families with more than 10 members. Based on conserved regions typical of lipases and esterases, we selected a representative gene from a family of 27 members for synthesis. This protein, when expressed in E. coli, showed lipolytic activity toward para-nitrophenyl (pNP) esters. The Km value of the enzyme was 66.68 μM for pNP-butyrate and 68.08 μM for pNP-palmitate, with kcat/Km values of 3.4 × 10⁶ and 6.6 × 10⁵ M⁻¹ s⁻¹, respectively. Hydrolysis of model substrates showed enantiopreference for the R-form. Reactions yielded 43 and 61% enantiomeric excess of products with ibuprofen methyl ester and 2-phenylpropanoic acid ethyl ester, respectively.
The enzyme retains 50% of its maximum activity at temperatures as low as 10°C, and its activity is enhanced in artificial seawater and in buffers with higher salt concentrations, with an optimum osmolarity of 3,890 mosmol/l.
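
The kinetic and enantioselectivity figures above can be unpacked with a little arithmetic. The following is a minimal sketch, not part of the original study: the function names are hypothetical, the turnover numbers are back-calculated from the reported Km and kcat/Km values rather than reported directly, and the enantiomer fractions assume the standard definition ee = (major − minor) / (major + minor).

```python
def kcat_from_efficiency(efficiency_per_M_s: float, km_uM: float) -> float:
    """Return kcat in s^-1 from kcat/Km (M^-1 s^-1) and Km (micromolar).

    Uses kcat = (kcat/Km) * Km, converting Km from micromolar to molar.
    """
    return efficiency_per_M_s * km_uM * 1e-6


def enantiomer_fractions(ee_percent: float) -> tuple[float, float]:
    """Return (major, minor) product fractions for a given enantiomeric excess.

    Assumes ee = (major - minor) / (major + minor) with major + minor = 1.
    """
    ee = ee_percent / 100.0
    return (1 + ee) / 2, (1 - ee) / 2


if __name__ == "__main__":
    # Values taken from the abstract; the derived kcat values are implied, not reported.
    for substrate, km_uM, efficiency in [
        ("pNP-butyrate", 66.68, 3.4e6),
        ("pNP-palmitate", 68.08, 6.6e5),
    ]:
        kcat = kcat_from_efficiency(efficiency, km_uM)
        print(f"{substrate}: implied kcat of about {kcat:.0f} s^-1")

    for substrate, ee in [
        ("ibuprofen methyl ester", 43),
        ("2-phenylpropanoic acid ethyl ester", 61),
    ]:
        major, minor = enantiomer_fractions(ee)
        print(f"{substrate}: {major:.1%} major vs {minor:.1%} minor enantiomer")
```

Under these assumptions, the similar Km values mean that the roughly fivefold difference in catalytic efficiency between the two substrates comes almost entirely from the turnover number (about 227 s⁻¹ versus about 45 s⁻¹), and an enantiomeric excess of 43% corresponds to a product mixture of roughly 71.5% to 28.5%.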

Highlights

  • The global market of industrial enzymes is estimated to have a value between $4.8 billion and $5.1 billion and rising (BCC Research, 2014; The Freedonia Group, 2014)

  • The 6,123,395 sequences of potential proteins from the Global Ocean Sampling (GOS) project were searched for potential novel lipolytic enzymes

  • We present here a combined bioinformatics/biochemistry approach for the discovery of new biocatalytic enzymes from metagenomic datasets

Introduction

The global market of industrial enzymes is estimated to have a value between $4.8 billion and $5.1 billion and rising (BCC Research, 2014; The Freedonia Group, 2014). Lipolytic enzymes have various applications in organic chemistry (Jaeger and Eggert, 2002). They are able to economically outcompete some conventional chemical methods, based on their high enantioselectivity, substrate specificity, and mild reaction conditions (Divakar and Manohar, 2007). Esterases (EC 3.1.1.1) and lipases (EC 3.1.1.3) are ubiquitous enzymes that hydrolyze organic ester bonds in aqueous solutions. The former hydrolyze small ester-containing molecules that are at least partly soluble in water, whereas the latter display maximum activity toward long-chain, insoluble triglycerides (Jaeger et al., 1999).
