Abstract

The Genoscope annotation workflow for eukaryotic genomes relies on evidence from ab initio gene model predictions combined with homology searches against collections of expressed sequences (full-length cDNAs, ESTs, or massive-scale mRNA sequences from the same or closely related organisms), proteins, or other genomic sequences. Global analysis of draft or complete genome sequences then combines both approaches by integrating the gene prediction data with GAZE, which is capable of identifying a majority of the existing gene features. Although the resulting gene models are of very good quality, they remain tentative at the end of the process. Computational predictors are useful for large-scale annotation and global genomic analysis, but there is no complete genome for which all gene structures (exons, introns, and coding regions) have been experimentally confirmed. Finished genomes can provide exciting insights into genome organization and evolution. Additional experimental data generated by genome sequencing projects assist genome annotation, with the aim of better understanding the biology of the organism. Gene models and annotations can therefore be improved by human curation, which finds errors and resolves conflicting evidence in the automatic annotation of the genome. In addition to our automated gene prediction pipeline, we now provide collaborators carrying out sequencing projects with a distributed annotation platform that allows expert evaluation of the annotation. To maximize the participation of the scientific community, a tool for revising annotations has been set up using components of the Generic Model Organism Database (GMOD) toolkit, which provides tools for managing organism databases. A CHADO database linked to an Apollo graphical interface permits users to correct gene structures and store them in a dedicated organism database, as we will show with a few examples.
Such a tool would facilitate connecting and comparing predicted annotations with existing biological data, and would become the repository for fully annotated, finished genome sequences.

