Abstract

Background

Since their discovery, recombinant proteins have become increasingly popular in many areas of life science. The global pharmaceutical market was valued at $87 billion in 2008, and sales of industrial enzymes exceeded $4 billion in 2012. These figures are strong evidence of the potential of recombinant proteins. However, a native gene introduced into a host can be incompatible with the host's expression system in its codon usage bias, GC content, repeat regions, and Shine-Dalgarno sequence, so yields can drop significantly. Hence, we propose novel methods for gene optimization based on neural networks, Bayesian theory, and Euclidean distance.

Result

The correlation coefficients of our neural network are 0.86, 0.73, and 0.90 in the training, validation, and testing processes, respectively. In addition, genes optimized by our methods appear to be associated with highly expressed genes and give reasonable codon adaptation index values. Furthermore, genes optimized by the proposed methods agree closely with previous experimental data.

Conclusion

The proposed methods have high potential for gene optimization and for further research in gene expression. We built a demonstration program using Matlab R2014a under Mac OS X. The program is published both as a standalone executable and as Matlab function files. It can be accessed from http://www.math.hcmus.edu.vn/~ptbao/paper_soft/GeneOptProg/.
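For readers unfamiliar with the codon adaptation index (CAI) mentioned in the results, the following is a minimal Python sketch of its standard definition: the geometric mean of each codon's relative adaptiveness, measured against a reference set of highly expressed genes. It is not the authors' Matlab program. The toy codon table, the function names, the reference sequence, and the 0.5 pseudocount for unseen codons are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy table: a few amino acids and their synonymous codons.
# A real analysis would use the full genetic code.
SYNONYMOUS = {
    "F": ["TTT", "TTC"],
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
}


def split_codons(seq):
    """Split a coding sequence into codons, dropping any trailing partial codon."""
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def relative_adaptiveness(reference_seqs):
    """Weight of each codon = its count / count of the most used synonym."""
    counts = Counter()
    for s in reference_seqs:
        counts.update(split_codons(s))
    weights = {}
    for codons in SYNONYMOUS.values():
        top = max(counts[c] for c in codons)
        if top == 0:
            continue  # amino acid absent from the reference set
        for c in codons:
            # Assumed pseudocount of 0.5 so unseen codons do not give log(0).
            weights[c] = max(counts[c], 0.5) / top
    return weights


def cai(seq, weights):
    """Geometric mean of the weights of the codons in seq that have a weight."""
    logs = [math.log(weights[c]) for c in split_codons(seq) if c in weights]
    if not logs:
        raise ValueError("sequence contains no codons with a known weight")
    return math.exp(sum(logs) / len(logs))


# Made-up reference "highly expressed" sequence, for illustration only.
reference = ["TTCTTCTTTAAAAAAAAGGAAGAAGAG"]
w = relative_adaptiveness(reference)
print(cai("TTCAAAGAA", w))  # uses only the preferred codons
print(cai("TTTAAGGAG", w))  # uses only the less frequent codons
```

A sequence built entirely from the most frequent codons of the reference set scores 1, and the score falls as rarer synonymous codons are substituted.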

Highlights

  • A Shapiro-Wilk test shows that almost none of the data fit a normal distribution, so we used the nonparametric Wilcoxon signed-rank test to investigate whether the correlations given by the neural network (NN) and the linear model differ significantly (Table 2) [44]

  • In this study, we proposed using the highly expressed gene (HEG) probability (HEGP), the distance to HEGs (DHEG), and an NN to optimize genes, and we described an approach for estimating the parameters of the linear function used in gene optimization

Summary

Introduction

Since their discovery, recombinant proteins have become increasingly popular in many areas of life science. The global pharmaceutical market was valued at $87 billion in 2008, and Elena's study indicated that the global market for industrial enzymes exceeded $4 billion in 2012 [2]. These figures are strong evidence of the potential of recombinant proteins. In the future, they could rise considerably thanks to synthetic biology tools that improve the productivity of recombinant protein production. We propose novel methods for gene optimization based on neural networks, Bayesian theory, and Euclidean distance.

Methods
Results
Conclusion
