Abstract

Background: In silico promoter prediction represents an important challenge in bioinformatics, as it provides a first-line approach to identifying regulatory elements to support wet-lab experiments. Historically, available promoter prediction software has focused on sigma factor-associated promoters in the model organism E. coli. As a consequence, traditional promoter predictors yield suboptimal predictions when applied to other prokaryotic genera, such as Pseudomonas, a Gram-negative genus of crucial medical and biotechnological importance.

Results: We developed SAPPHIRE, a promoter predictor for σ70 promoters in Pseudomonas. The prediction relies on an artificial neural network that evaluates sequences on their similarity to the −35 and −10 boxes of σ70 promoters found experimentally in P. aeruginosa and P. putida. SAPPHIRE currently outperforms established predictive software when classifying Pseudomonas σ70 promoters and was built to allow further expansion in the future.

Conclusions: SAPPHIRE is the first predictive tool for bacterial σ70 promoters in Pseudomonas. SAPPHIRE is free, publicly available and can be accessed online at www.biosapphire.com. Alternatively, users can download the tool from this site as a Python 3 script for local use.

Highlights

  • Data: SAPPHIRE was trained using a dataset of 170 unique Pseudomonas σ70 promoters (Additional file 1)


Summary

Results

Benchmarking: Due to the imbalanced nature of the dataset, accuracy was expected to be biased towards specificity and to mask minority-class performance (the true positive rate). The model for Pseudomonas σ70 promoters was compared to two established online tools for bacterial promoter prediction, BPROM [13] and CNNPromoter_b [14]. Both the complete and test datasets were analyzed using SAPPHIRE, BPROM and CNNPromoter_b (Table 1). The notably poor sensitivity of 18.8% on the complete dataset (32 out of 170 promoter sequences detected) for both BPROM and CNNPromoter_b corroborates the need for Pseudomonas-specific promoter prediction tools.

Autographivirinae are known to encode σ70 promoters at the left end of their genomes, driving expression of a phage-encoded RNA polymerase that subsequently transcribes the remainder of the phage genes. These biologically consistent findings on phage genomes suggest that SAPPHIRE provides suitable predictions for multiple members across the Pseudomonas genus.
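To illustrate why accuracy can hide poor minority-class performance on an imbalanced dataset, the following minimal sketch computes sensitivity, specificity and accuracy from confusion-matrix counts. The true-positive and false-negative counts (32 and 138) follow from the 32 of 170 promoters reported above. The negative-class counts are hypothetical, since this summary does not report them, and the function name `rates` is introduced here for illustration only.

```python
def rates(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# From the text: 32 of 170 promoter sequences were detected.
TP, FN = 32, 170 - 32

# ASSUMPTION: the number of non-promoter sequences is not given in this
# summary. 1000 negatives with 50 false positives are made-up values
# chosen only to show the effect of class imbalance.
TN, FP = 950, 50

sens, spec, acc = rates(TP, FN, TN, FP)
print(f"sensitivity={sens:.1%}  specificity={spec:.1%}  accuracy={acc:.1%}")
```

Under these assumed negative counts, accuracy stays above 80% although fewer than one in five promoters is detected, which is why sensitivity is reported separately.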

