Abstract

Background: Promoter strength plays a critical role in modulating protein expression in genetic engineering. However, only a few studies have examined the strength of promoters drawn from a comprehensive genomic database of sigma factors. To circumvent time- and resource-intensive experiments, artificial intelligence (AI) was used to construct a complete database of proposed promoters from Escherichia coli, and prediction algorithms were then used to evaluate promoter strength, which was confirmed using the fluorescence intensity of green fluorescent protein (GFP).

Methods: The promoter database was constructed using partial information from Ecocyc, and the predicted strength of the promoters was calculated via the phiSITE hunter tool. The 1744 promoter entries in the database were derived from E. coli MG1655, and a total of 935 sigma factor 70 (σ70) promoters were identified. The training database was then used to develop a precise tool for predicting promoter strength using machine learning and six deep learning models. The accuracy of the predictions was confirmed through wet-lab experiments conducted on endogenous and J-series promoters.

Significant findings: By employing a deep learning model, particularly the Convolutional Neural Network (CNN), the promoter prediction fitness of phiSITE, which relies on traditional alignment metrics, was validated. In addition, phiSITE gave satisfactory results in the fluorescence experiments using 7 endogenous promoters, achieving an R-squared (R²) of 0.93. When the same model was applied to predict the strength of J-series promoters, the best R² reached 0.99. Thus, the CNN model is an effective tool for AI-based evaluation of promoter strength.
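The R-squared values quoted above measure how well predicted promoter strength tracks measured GFP fluorescence. The abstract does not say how R-squared was computed; the minimal sketch below assumes it is the coefficient of determination of a simple least-squares line, which for one predictor equals the squared Pearson correlation. The function name `r_squared` and the sample values are hypothetical placeholders, not data from the study.

```python
from math import fsum


def r_squared(predicted, measured):
    """R^2 of a simple least-squares line relating measured to predicted values.

    Assumption: for a one-variable linear fit this equals the squared
    Pearson correlation, which is what is computed here.
    """
    if len(predicted) != len(measured) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    n = len(predicted)
    mean_p = fsum(predicted) / n
    mean_m = fsum(measured) / n
    # Sum of cross-deviations and sums of squared deviations
    cov = fsum((p - mean_p) * (m - mean_m) for p, m in zip(predicted, measured))
    var_p = fsum((p - mean_p) ** 2 for p in predicted)
    var_m = fsum((m - mean_m) ** 2 for m in measured)
    if var_p == 0 or var_m == 0:
        raise ValueError("R^2 is undefined when either sequence is constant")
    return cov * cov / (var_p * var_m)


if __name__ == "__main__":
    # Placeholder numbers for illustration only, NOT results from the study.
    predicted_strength = [1.0, 2.0, 3.0, 4.0]
    measured_gfp = [2.1, 3.9, 6.2, 7.8]
    print(round(r_squared(predicted_strength, measured_gfp), 3))
```

A value near 1 indicates that the predicted strengths are almost perfectly linearly related to the measured fluorescence; a value near 0 indicates no linear relationship.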
