Abstract

Genome annotation conceptually consists of inferring and assigning biological information to gene products. Over the years, numerous pipelines and computational tools have been developed to automate this task and help researchers gain knowledge about the genes they study. Even with these advances, however, manual annotation or manual curation remains necessary to verify and enrich the information attributed to gene products. Although it is considered the gold standard for depositing data in a biological database, manual curation demands significant time and effort from researchers, who sometimes have to parse numerous products across various public databases. To assist with this problem, we present CODON, a tool for manual curation of genomic data that is also capable of performing the prediction and annotation process. The software uses a finite state machine in the prediction process and automatically annotates products based on information obtained from the UniProt database. CODON is equipped with a simple and intuitive graphical interface that assists with manual curation: the user can decide on the analysis using the identity, the alignment length, and the name of the organism in which the product obtained a match. Visual analysis of all matches found in the database is also possible, which significantly eases the curation task because the user has all the information available for a given product at their disposal. To test the efficiency of the tool, an analysis was performed on eleven organisms, comparing CODON's prediction and annotation results with those from the NCBI and RAST platforms.
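The abstract does not describe the states or transitions of CODON's finite state machine, so the following is only a minimal sketch of how an ORF scanner can be framed as a two-state machine. The state names, the `find_orfs` function, the `min_codons` parameter, and the restriction to the forward strand with `ATG` as the only start codon are all assumptions made for illustration, not CODON's actual design.

```python
from enum import Enum, auto

# Assumption: standard stop codons and ATG as the only start codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}
START_CODON = "ATG"


class State(Enum):
    SEARCH = auto()  # scanning for a start codon
    IN_ORF = auto()  # inside a candidate ORF, scanning for a stop codon


def find_orfs(seq, min_codons=2):
    """Return sorted (start, end) half-open index pairs of forward-strand ORFs.

    min_codons counts the start and stop codons as part of the ORF.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        state = State.SEARCH
        start = None
        # Step through the sequence one codon at a time in this frame.
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if state is State.SEARCH:
                if codon == START_CODON:
                    state = State.IN_ORF
                    start = i
            elif codon in STOP_CODONS:
                # Stop codon closes the ORF; return to searching.
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3))
                state = State.SEARCH
    return sorted(orfs)


print(find_orfs("CCATGAAATAGGG"))
```

A real predictor would also scan the reverse complement and handle alternative start codons; this sketch leaves both out to keep the state transitions visible.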

Highlights

  • The advent of DNA sequencing platforms greatly advanced the deposition of biological information in public databases

  • The accuracy of genome annotation is directly impacted by the manual curation step since complementary information is added to gene products

  • We present the CODON software, which makes this process dynamic and significantly reduces the total work. The user conducting manual curation can check the annotation of each product directly on screen, without searching external databases for the similarity of each open reading frame (ORF), and can change an ORF's annotation based on the displayed information, for example percentage of identity and alignment length
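As a rough illustration of the curation criteria named above (percentage of identity, alignment length, and organism), the sketch below filters and ranks database matches for one ORF. The `Hit` record, the `shortlist` function, the threshold values, and the sample hits are all hypothetical and are not taken from CODON or from UniProt.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    product: str           # annotated product name of the matched entry
    organism: str          # organism in which the match was found
    identity: float        # percent identity, 0-100
    alignment_length: int  # number of aligned positions


def shortlist(hits, min_identity=90.0, min_length=100):
    """Keep hits meeting both thresholds, best identity (then length) first."""
    kept = [
        h for h in hits
        if h.identity >= min_identity and h.alignment_length >= min_length
    ]
    return sorted(
        kept,
        key=lambda h: (h.identity, h.alignment_length),
        reverse=True,
    )


# Made-up example hits, for illustration only.
hits = [
    Hit("hypothetical protein", "Organism A", 99.1, 80),
    Hit("DNA gyrase subunit A", "Organism B", 97.5, 850),
    Hit("DNA gyrase subunit A", "Organism C", 98.2, 840),
    Hit("topoisomerase", "Organism D", 71.0, 900),
]

for h in shortlist(hits):
    print(h.product, h.organism, h.identity, h.alignment_length)
```

In an interactive curation tool the final choice stays with the curator; a filter like this would only order the candidates shown on screen.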

Author summary

The accuracy of genome annotation is directly impacted by the manual curation step, since complementary information is added to gene products. In addition to providing the user with access to highly accurate database information, CODON produces a result with more gene acronyms, metabolic pathway information, and Gene Ontology terms. This is a PLOS Computational Biology Software paper.
