Abstract

BackgroundThe quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement. Increasing the number of correct identifications, both of genes and of the translation initiation sites for each gene, and reducing the overall number of false positives, are all desirable goals.ResultsWith our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused specifically on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives.ConclusionWe built a fast, lightweight, open source gene prediction program called Prodigal http://compbio.ornl.gov/prodigal/. Prodigal achieved good results compared to existing methods, and we believe it will be a valuable asset to automated microbial annotation pipelines.

Highlights

  • The quality of automated gene prediction in microbial organisms has improved steadily over the past decade, but there is still room for improvement

  • Existing methods for bacterial and archaeal gene prediction include the popular Glimmer [1] and GenemarkHMM [2] packages, both of which are included at NCBI alongside Genbank [3] annotations (Prodigal is included), as well as other methods such as Easygene [4] and MED [5]

  • The EcoGene Verified Protein Starts set [12] remains the only large set of experimentally verified genes and translation start sites for typical bacteria. Another large set was produced for two archaea, Halobacterium salinarum and Natronomonas pharaonis, using a special proteomics technique for extracting N-terminal peptides [17]

Read more

Summary

Results

With our years of experience in manually curating genomes for the Joint Genome Institute, we developed a new gene prediction algorithm called Prodigal (PROkaryotic DYnamic programming Gene-finding ALgorithm). With Prodigal, we focused on the three goals of improved gene structure prediction, improved translation initiation site recognition, and reduced false positives. We compared the results of Prodigal to existing gene-finding methods to demonstrate that it met each of these objectives

Conclusion
Background
Results and Discussion
Conclusions
12. Rudd K
14. Bellman R
Full Text
Published version (Free)

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call