Abstract

Background: High-quality annotation of the genes and transposable elements in complex genomes requires a human-curated integration of multiple sources of computational evidence. This evidence includes results from a diversity of ab initio prediction programs as well as homology-based searches. Most of these programs operate on a single contiguous sequence at a time, and the results are generated in a diverse array of readable formats that must be translated to a standardized file format. These translated results must then be concatenated into a single source and presented in an integrated form for human curation.

Results: We have designed, implemented, and assessed a Perl-based workflow named DAWGPAWS for the generation of computational results for human curation of the genes and transposable elements in plant genomes. The use of DAWGPAWS was found to accelerate annotation of 80–200 kb wheat DNA inserts in bacterial artificial chromosome (BAC) vectors by approximately twenty-fold and to significantly improve the quality of the annotation in terms of completeness and accuracy.

Conclusion: The DAWGPAWS genome annotation pipeline fills an important need in the annotation of plant genomes by generating computational evidence in a high-throughput manner, translating these results to a common file format, and facilitating the human curation of these computational results. We have verified the value of DAWGPAWS by using this pipeline to annotate the genes and transposable elements in 220 BAC inserts from the hexaploid wheat genome (Triticum aestivum L.). DAWGPAWS can be applied to annotation efforts in other plant genomes with minor modifications of program-specific configuration files, and the modular design of the workflow facilitates integration into existing pipelines.
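The translate-then-concatenate step described in the abstract can be illustrated with a minimal sketch. It is written in Python for brevity and is not taken from DAWGPAWS itself, which is Perl-based. It assumes that the common format is a GFF-style nine-column record and that the input is BLAST tabular output (the standard 12-column layout); the function names `blast_tab_to_gff` and `merge_gff` and the sample identifiers are hypothetical.

```python
def blast_tab_to_gff(line, source="blastn"):
    """Translate one BLAST tabular hit (12 standard columns) into a
    GFF-style nine-column record placed on the query sequence.
    Assumption: the query is the BAC being annotated."""
    fields = line.rstrip("\n").split("\t")
    qseqid, sseqid = fields[0], fields[1]
    qstart, qend = int(fields[6]), int(fields[7])
    sstart, send = int(fields[8]), int(fields[9])
    bitscore = fields[11].strip()

    # GFF requires start <= end; orientation goes in the strand column.
    start, end = min(qstart, qend), max(qstart, qend)
    same_direction = (qstart <= qend) == (sstart <= send)
    strand = "+" if same_direction else "-"

    return "\t".join([
        qseqid, source, "match", str(start), str(end),
        bitscore, strand, ".", f"Name={sseqid}",
    ])


def merge_gff(*record_groups):
    """Concatenate GFF records from several programs into one list,
    sorted by sequence ID, then start, then end."""
    records = [rec for group in record_groups for rec in group]

    def sort_key(rec):
        cols = rec.split("\t")
        return (cols[0], int(cols[3]), int(cols[4]))

    return sorted(records, key=sort_key)


if __name__ == "__main__":
    # Hypothetical hit of a BAC against a transposable element library.
    hit = "BAC001\tTE_lib_42\t95.0\t200\t10\t0\t1500\t1699\t200\t1\t1e-50\t300"
    # Hypothetical record already translated from a gene predictor.
    gene = "BAC001\tgenepredictor\tgene\t100\t900\t.\t+\t.\tName=g1"
    for record in merge_gff([blast_tab_to_gff(hit)], [gene]):
        print(record)
```

Keeping one small translator per program and a single merge step mirrors the modular design the abstract describes: a new evidence source needs only its own converter, and the curated view is built from the merged records.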

Highlights

  • High quality annotation of the genes and transposable elements in complex genomes requires a human-curated integration of multiple sources of computational evidence

  • In annotating 220 bacterial artificial chromosome (BAC) inserts from hexaploid bread wheat, we found that the DAWGPAWS pipeline increased the rate of individual BAC annotation approximately twenty-fold

  • The DAWGPAWS annotation workflow provides a suite of command-line programs that generate computational evidence for human curation in a high-throughput fashion


Plant Methods

The DAWGPAWS pipeline for the annotation of genes and transposable elements in plant genomes. Address: (1) Department of Plant Biology, The University of Georgia, Athens, Georgia 30602-7271, USA and (2) Department of Genetics, The University of Georgia, Athens, Georgia 30602-7223, USA

Bennetzen JL