Abstract

Background: The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating these genome sequences are urgently needed.

Results: We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar full-length protein, cDNA and rRNA sequences, integrating results from the Blastx, Blastn, protein2genome and est2genome programs. Second, it identifies tRNA genes using tRNAscan and ARAGORN, and inverted repeats (IR) using vmatch. Third, it calculates summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results, in GFF3 format, can be edited using any compatible annotation editing tool. The edited annotations can then be uploaded to CPGAVAS repeatedly for updating and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while offering several superior functionalities.

Conclusions: CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas.
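To illustrate what an inverted repeat is, the sketch below looks for short segments whose reverse complement occurs further along the same sequence. It is a naive k-mer lookup written for illustration only; it is not how vmatch or CPGAVAS work, and the function names and the toy sequence are assumptions introduced here.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def find_inverted_kmers(seq, k):
    """Return (i, j) pairs, i < j, where seq[i:i+k] is the reverse
    complement of seq[j:j+k]. Naive illustration, not a production tool."""
    seq = seq.upper()
    # Index every k-mer by its start positions.
    index = {}
    for i in range(len(seq) - k + 1):
        index.setdefault(seq[i:i + k], []).append(i)

    hits = []
    for j in range(len(seq) - k + 1):
        rc = reverse_complement(seq[j:j + k])
        for i in index.get(rc, []):
            if i < j:  # report each pair once, upstream copy first
                hits.append((i, j))
    return hits


# Toy sequence (hypothetical): a 7-bp segment, a spacer, and the
# reverse complement of that segment.
toy = "ACGGTCA" + "TTTTT" + "TGACCGT"
hits = find_inverted_kmers(toy, 7)
print(hits)
```

Real chloroplast inverted repeats are typically tens of kilobases long, so a dedicated repeat finder is the practical choice; the sketch only shows the underlying definition.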

Highlights

  • The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species

  • We have developed a web server, Chloroplast Genome Annotation, Visualization, Analysis, and GenBank Submission (CPGAVAS), to provide functions that support standard practices for annotating and analyzing chloroplast genome sequences and that are missing in DOGMA

  • The output includes several files that contain: (1) annotation results in GFF3 format; (2) a circular map of the annotated chloroplast genome in PNG format; (3) tables describing summary statistics of the genome; (4) annotation results combined with other user information in Sequin format
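As a rough idea of how GFF3 annotation output can be turned into summary tables, the sketch below counts features by type and sums their lengths. It assumes only the standard nine-column, tab-separated GFF3 layout; the example records, identifiers and coordinates are hypothetical, and the statistics CPGAVAS itself reports may differ.

```python
from collections import Counter


def summarize_gff3(text):
    """Count features and total annotated bases per feature type
    in GFF3 text. Comment and malformed lines are skipped."""
    counts = Counter()
    total_bp = Counter()
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue  # not a standard GFF3 feature line
        feature_type = cols[2]
        start, end = int(cols[3]), int(cols[4])
        counts[feature_type] += 1
        total_bp[feature_type] += end - start + 1  # GFF3 is 1-based, inclusive
    return counts, total_bp


# Hypothetical records, built with explicit tabs so the columns are unambiguous.
EXAMPLE_GFF3 = "\n".join([
    "##gff-version 3",
    "\t".join(["seq1", "example", "gene", "1", "90", ".", "+", ".", "ID=gene1"]),
    "\t".join(["seq1", "example", "CDS", "1", "90", ".", "+", "0", "ID=cds1;Parent=gene1"]),
    "\t".join(["seq1", "example", "tRNA", "200", "272", ".", "-", ".", "ID=trn1"]),
])

counts, total_bp = summarize_gff3(EXAMPLE_GFF3)
for feature_type in sorted(counts):
    print(feature_type, counts[feature_type], total_bp[feature_type])
```

Because GFF3 is plain text, the same function can be rerun after each round of manual editing to check how the feature counts changed.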

Introduction

The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. Regions of chloroplast genomes have been widely used as phylogenetic [1,2] and DNA barcoding markers [3,4,5] to determine the phylogenetic relationships of organisms and the identity of particular DNA samples. The complete sequences of chloroplast genomes [...] automatic annotation software, repeated manual editing by domain experts is required. The annotation results need to be submitted to GenBank for publication. Carrying out these steps can be tedious and time-consuming for bench scientists. They can become a bottleneck with the deluge of complete chloroplast genome sequences
