Abstract

Background

Yeasts are a model system for exploring eukaryotic genome evolution. Next-generation sequencing technologies are poised to vastly increase the number of yeast genome sequences, both from resequencing projects (population studies) and from de novo sequencing projects (new species). However, the annotation of genomes presents a major bottleneck for de novo projects, because it still relies on a process that is largely manual.

Results

Here we present the Yeast Genome Annotation Pipeline (YGAP), an automated system designed specifically for new yeast genome sequences lacking transcriptome data. YGAP does automatic de novo annotation, exploiting homology and synteny information from other yeast species stored in the Yeast Gene Order Browser (YGOB) database. The basic premises underlying YGAP's approach are that data from other species already tells us what genes we should expect to find in any particular genomic region and that we should also expect that orthologous genes are likely to have similar intron/exon structures. Additionally, it is able to detect probable frameshift sequencing errors and can propose corrections for them. YGAP searches intelligently for introns, and detects tRNA genes and Ty-like elements.

Conclusions

In tests on Saccharomyces cerevisiae and on the genomes of Naumovozyma castellii and Tetrapisispora blattae newly sequenced with Roche-454 technology, YGAP outperformed another popular annotation program (AUGUSTUS). For S. cerevisiae and N. castellii, 91–93% of YGAP's predicted gene structures were identical to those in previous manually curated gene sets. YGAP has been implemented as a webserver with a user-friendly interface at http://wolfe.gen.tcd.ie/annotation.

Highlights

  • After the approximate correspondence between a region of the newly sequenced genome and a region of the Ancestral genome has been established, the gene content of that Ancestral region can be used to improve the annotation of the corresponding region in the new genome. For example, it can help decide the correct orthology relationships for genes that belong to multigene families, or reveal genes that were not initially annotated but are expected in the region because they are present in the syntenic region in other species [32]. In this manuscript we present YGAP (Yeast Genome Annotation Pipeline), the pipeline we developed to carry out automated annotation by this approach.

  • Tests with S. cerevisiae: to test YGAP's performance, we ran an automatic annotation of the genome of S. cerevisiae, which is very well studied and annotated.
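
The synteny idea described in the highlights above, using the gene content of an Ancestral region to flag genes that are expected but not yet annotated, can be illustrated with a minimal Python sketch. This is not YGAP's implementation: the function name, the data layout, and the gene identifiers are hypothetical, and the sketch assumes that orthology between annotated genes and Ancestral genes has already been assigned.

```python
def expected_missing_genes(ancestral_order, annotated_orthologs):
    """Return Ancestral genes that lie between the outermost matched
    genes of a region but have no annotated ortholog in the new genome.

    ancestral_order: list of Ancestral gene IDs in chromosomal order
                     (hypothetical IDs, for illustration only).
    annotated_orthologs: set of Ancestral gene IDs that already have an
                         annotated ortholog in the new genome.
    """
    # Positions in the Ancestral region that are matched by an annotated gene.
    matched = [i for i, gene in enumerate(ancestral_order)
               if gene in annotated_orthologs]
    if len(matched) < 2:
        # Without two anchors there is no bounded syntenic interval.
        return []
    first, last = matched[0], matched[-1]
    # Unmatched genes inside the anchored interval are candidates for
    # a targeted search in the corresponding region of the new genome.
    return [gene for gene in ancestral_order[first:last + 1]
            if gene not in annotated_orthologs]


# Hypothetical example region with five Ancestral genes.
region = ["Anc_1.1", "Anc_1.2", "Anc_1.3", "Anc_1.4", "Anc_1.5"]
already_annotated = {"Anc_1.2", "Anc_1.3", "Anc_1.5"}
candidates = expected_missing_genes(region, already_annotated)
print(candidates)  # ['Anc_1.4']
```

In this sketch only genes inside the anchored interval are reported, so Anc_1.1, which lies outside the outermost matches, is not flagged. A real pipeline would also have to handle rearrangements, gene loss, and duplicated regions, none of which this example attempts.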

Summary

Introduction

Yeasts provide an excellent system for exploring eukaryotic genome evolution by comparative genomics because their genomes are compact (9–20 Mb with 4700–6500 genes) with few introns, making them straightforward to sequence, but they still retain extensive synteny across deep phylogenetic distances [1,2,3,4,5]. Yeast comparative genomics has produced many insights into genome evolution, including the discovery of whole-genome duplication (WGD) [7]; development of methods for identifying conserved regulatory elements and RNA genes [8,9,10]; exploration of changes in the genetic code [11]; and detection of horizontal gene transfer and its functional consequences [12,13]. A similar approach was conducted with the pathogenic basidiomycete yeast Cryptococcus neoformans, responsible for cryptococcal meningitis [14].
