Abstract

Genome annotation is the process of identifying the location and function of a genome's encoded features. Improving the biological accuracy of annotation is a complex and iterative process requiring researchers to review and incorporate multiple sources of information such as transcriptome alignments, predictive models based on sequence profiles, and comparisons to features found in related organisms. Because rapidly decreasing costs are enabling an ever-growing number of scientists to incorporate sequencing as a routine laboratory technique, there is widespread demand for tools that can assist in the deliberative analytical review of genomic information. To this end, we present Apollo, an open source software package that enables researchers to efficiently inspect and refine the precise structure and role of genomic features in a graphical browser-based platform. Some of Apollo’s newer user interface features include support for real-time collaboration, allowing distributed users to simultaneously edit the same encoded features while also instantly seeing the updates made by other researchers on the same region in a manner similar to Google Docs. Its technical architecture enables Apollo to be integrated into multiple existing genomic analysis pipelines and heterogeneous laboratory workflow platforms. Finally, we consider the implications that Apollo and related applications may have on how the results of genome research are published and made accessible.

Highlights

  • Apollo’s design is based on the premise that the best genomic descriptions (‘annotations’) can be produced by starting with automatically generated sequence features and providing expert researchers with interactive editing tools to examine these multiple sources of evidence and collaboratively refine the genomic annotations

  • Apollo’s Genomic Editing Workspace displays tracks of information gathered from upstream pipelines and individual users’ analyses

  • We have encountered situations such as a research group studying many species in a particular clade; large, geographically distributed teams focused on a particular genomic region; and many students in a class working on team projects

Introduction

Apollo’s design is based on the premise that the best genomic descriptions (‘annotations’) can be produced by starting with automatically generated sequence features and providing expert researchers with interactive editing tools to examine these multiple sources of evidence and collaboratively refine the genomic annotations. Curation of some gene models of the yellow potato cyst nematode, Globodera rostochiensis, using RNA-seq alignments as evidence, revealed a high frequency of non-canonical splice sites. Subsequent use of these manually curated genes as a training set markedly improved the automated gene predictions [36]. To briefly describe the basic capabilities, Apollo’s Genomic Editing Workspace (bottom left of Fig 1) displays tracks of information gathered from upstream pipelines and individual users’ analyses. These tracks provide the evidence (predictions and alignments) for refining genomic annotations. The reference sequence tab provides a sortable and searchable list of every scaffold, including the name, length, and number of annotations on each, for navigation across the genome.
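To make the notion of a non-canonical splice site mentioned above more concrete, the following is a minimal sketch, not Apollo's code and not the method used in the cited curation study. It assumes the common convention that a canonical intron begins with the dinucleotide GT and ends with AG, handles forward-strand introns only, and uses 0-based, half-open coordinates. All function names are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: flags introns whose boundary dinucleotides
# differ from the canonical GT...AG pattern. Forward strand only;
# coordinates are 0-based and half-open (start inclusive, end exclusive).

CANONICAL = ("GT", "AG")  # assumed definition of a canonical intron


def intron_splice_sites(sequence, intron_start, intron_end):
    """Return the (donor, acceptor) dinucleotides of one intron."""
    intron = sequence[intron_start:intron_end].upper()
    if len(intron) < 4:
        raise ValueError("intron too short to have distinct splice sites")
    return intron[:2], intron[-2:]


def is_canonical(sequence, intron_start, intron_end):
    """True if the intron starts with GT and ends with AG."""
    return intron_splice_sites(sequence, intron_start, intron_end) == CANONICAL


def noncanonical_fraction(sequence, introns):
    """Fraction of introns, given as (start, end) pairs, that are non-canonical."""
    if not introns:
        return 0.0
    flagged = [i for i in introns if not is_canonical(sequence, *i)]
    return len(flagged) / len(introns)


if __name__ == "__main__":
    # Toy scaffold with a single intron spanning positions 3..11.
    scaffold = "AAAGCCCCCAGTTT"
    print(intron_splice_sites(scaffold, 3, 11))
    print(noncanonical_fraction(scaffold, [(3, 11)]))
```

In practice, intron coordinates would come from the RNA-seq alignments or gene models shown as evidence tracks, and reverse-strand features would need to be reverse-complemented before the check.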
