Abstract

Background: Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. However, many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. While maintaining comprehensive genomic databases for a set of virus families at the Viral Bioinformatics Resource Center and Viral Bioinformatics – Canada, we found that researchers were unnecessarily spending time annotating viral genomes that were close relatives of already annotated viruses. We have therefore designed and implemented a novel tool, Genome Annotation Transfer Utility (GATU), to transfer annotations from a previously annotated reference genome to a new target genome, thereby greatly reducing this laborious task.

Results: GATU transfers annotations from a reference genome to a closely related target genome, while still giving the user final control over which annotations should be included. GATU also detects open reading frames (ORFs) that are present in the target but not in the reference genome, and provides the user with a variety of bioinformatics tools to quickly determine whether these ORFs should also be included in the annotation. After this process is complete, GATU saves the newly annotated genome as a GenBank, EMBL or XML-format file. The software is coded in Java and runs on a variety of computer platforms. Its user-friendly graphical user interface is specifically designed for users trained in the biological sciences.

Conclusion: GATU greatly simplifies the initial stages of genome annotation by using a closely related genome as a reference. It is not intended to be a gene prediction tool or a "complete" annotation system, but we have found that it significantly reduces the time required for annotation of genes and mature peptides, and that it helps to standardize gene names between related organisms by transferring reference genome annotations to the target genome. The program is freely available under the General Public License and can be accessed along with documentation and tutorial from http://www.virology.ca/gatu.
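
The idea of detecting ORFs that are present in the target but not covered by transferred annotations can be illustrated with a short sketch. This is not GATU's implementation (GATU is written in Java, and its method is not described in this abstract). It is a deliberately naive Python example that assumes an ORF is any in-frame run from an ATG to the first stop codon, and that an ORF counts as "new" when it overlaps no transferred annotation on the same strand. The function names `find_orfs` and `unannotated_orfs`, the coordinate convention, and the example sequence are all hypothetical.

```python
# Illustrative sketch only: a naive ORF scan, not GATU's actual algorithm.
STOPS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=3):
    """Find ATG-to-stop ORFs in all six reading frames.

    Returns sorted (start, end, strand) tuples, where start/end are
    0-based, half-open coordinates on the forward strand. min_codons
    counts both the start codon and the stop codon.
    """
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # first in-frame ATG opens the ORF
                elif start is not None and codon in STOPS:
                    end = i + 3  # include the stop codon
                    if (end - start) // 3 >= min_codons:
                        if strand == "+":
                            orfs.append((start, end, "+"))
                        else:
                            # map back to forward-strand coordinates
                            orfs.append((n - end, n - start, "-"))
                    start = None
    return sorted(orfs)


def unannotated_orfs(orfs, annotated):
    """Keep ORFs that overlap no existing annotation on the same strand."""
    def overlaps(a, b):
        return a[2] == b[2] and a[0] < b[1] and b[0] < a[1]
    return [o for o in orfs if not any(overlaps(o, a) for a in annotated)]


if __name__ == "__main__":
    target = "CCATGAAATAGGGATGCCCGGGTTTTAACC"  # made-up toy sequence
    transferred = [(2, 11, "+")]  # pretend this gene came from the reference
    candidates = unannotated_orfs(find_orfs(target), transferred)
    print(candidates)  # ORFs a user would still need to review
```

In a real workflow, the candidates returned by such a scan would still need review by the user, which matches the abstract's point that the user keeps final control over which annotations are included.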

Highlights

  • Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced

  • The program is freely available under the General Public License and can be accessed along with documentation and tutorial from http://www.virology.ca/gatu

  • In order to facilitate the process of annotation, we have developed a tool, Genome Annotation Transfer Utility (GATU), which makes use of the fact that most unannotated genomes are closely related to previously annotated genomes

Summary

Introduction

Since DNA sequencing has become easier and cheaper, an increasing number of closely related viral genomes have been sequenced. Many of these have been deposited in GenBank without annotations, severely limiting their value to researchers. Although this ability to gather larger collections of genome sequences has opened up new avenues of research, it has also led to significant problems related to data management and sequence annotation. Examples of this data explosion include the following, all found in GenBank: 1) 1201 nearly complete genomes of human immunodeficiency virus (HIV); 2) 53 complete poxvirus genomes, ranging in size from 134 to 360 kb; 3) more than 125 SARS genomes submitted since the first two SARS coronavirus genomes were published in May 2003.

The application can be run on most major operating systems, including Mac OS X, Windows and Linux.

