Abstract

Lignocellulases are among the most important enzymes for bioeconomy development, and mining new lignocellulase-coding genes from metagenomic DNA data has recently attracted considerable interest. However, identifying genes that can be expressed successfully in E. coli for enzyme characterization remains a major challenge. In this study, 18 lignocellulase genes from metagenomic data of bacteria in goat rumen, termite gut and humus were expressed in E. coli. The 18 nucleotide and amino acid sequences were then used to measure 12 features that may influence expression and to evaluate tools for predicting their expressibility in E. coli. The features most closely related to the expression level of the enzymes were the relative content of aliphatic amino acid side chains (aliphatic index, AI), the grand average of hydropathicity, and the protein folding ability (fold index, FI), with FI being the most important factor. An evaluation of two models for predicting the expressibility of the enzymes in E. coli showed that all sequences predicted by Periscope to have high expressibility were expressed at a high level in the experiments. Conversely, all sequences predicted by ESPRESSO to have low expressibility were expressed at a low level in the experiments. These results can serve as a useful reference for screening genes before expression in E. coli.
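
The three sequence features named above can be computed directly from an amino acid sequence. The sketch below is illustrative only: the abstract does not say which software or parameter settings the authors used, so it assumes the commonly published definitions (Ikai's aliphatic index, the Kyte-Doolittle hydropathy scale for the grand average of hydropathicity, and the FoldIndex formula 2.785<H> - |<R>| - 1.151 applied to the whole sequence rather than a sliding window). The function names and the example sequence are hypothetical.

```python
# Sketch of three sequence features, under the assumptions stated above.
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def _clean(seq: str) -> str:
    """Upper-case the sequence and reject empty input or non-standard residues."""
    seq = seq.strip().upper()
    unknown = set(seq) - set(KD)
    if not seq or unknown:
        raise ValueError(f"empty sequence or unsupported residues: {sorted(unknown)}")
    return seq


def aliphatic_index(seq: str) -> float:
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)), X in mole percent."""
    seq = _clean(seq)
    n = len(seq)

    def pct(aa: str) -> float:
        return 100.0 * seq.count(aa) / n

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))


def gravy(seq: str) -> float:
    """Grand average of hydropathicity: mean Kyte-Doolittle value per residue."""
    seq = _clean(seq)
    return sum(KD[aa] for aa in seq) / len(seq)


def fold_index(seq: str) -> float:
    """Whole-sequence fold index; positive values suggest a folded protein."""
    seq = _clean(seq)
    n = len(seq)
    # Mean hydrophobicity with the Kyte-Doolittle scale rescaled to the range 0..1.
    mean_h = sum((KD[aa] + 4.5) / 9.0 for aa in seq) / n
    # Mean net charge per residue: Lys/Arg count as +1, Asp/Glu as -1.
    charge = seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")
    mean_r = charge / n
    return 2.785 * mean_h - abs(mean_r) - 1.151


if __name__ == "__main__":
    example = "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"  # hypothetical toy sequence
    print("AI   :", round(aliphatic_index(example), 2))
    print("GRAVY:", round(gravy(example), 3))
    print("FI   :", round(fold_index(example), 3))
```

Values computed this way are only a rough screen; they would need to be compared against measured expression levels, as the study does, before being used to rank candidate genes.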

