Abstract

High-quality gene models are necessary to expand the molecular and genetic tools available for a target organism, but they exist for only a handful of model organisms that have undergone extensive curation and experimental validation over many years. Most gene models in biological databases today were identified in draft genome assemblies by automated annotation pipelines, which are frequently based on orthologs from distantly related model organisms and usually contain minor or major errors. Manual curation is time-consuming and often requires substantial expertise, but it is instrumental in improving gene model structure and identification. Manual annotation may seem a daunting and cost-prohibitive task for small research communities, but involving undergraduates in community genome annotation consortia can benefit both education and genomic resources. We outline a workflow for efficient manual annotation driven by a team of primarily undergraduate annotators. This model can be scaled to large teams and includes quality control through incremental evaluation. Moreover, it gives students an opportunity to deepen their understanding of genome biology and to participate in scientific research in collaboration with peers and senior researchers at multiple institutions.

Highlights

  • This guide describes the workflow for a community genome annotation project that connects undergraduate students with bioinformaticians, faculty and peer mentors to foster educational development and produce quality student-driven annotation

  • The manual curation of gene families promotes a deeper understanding of underlying biology, a critical learning experience for undergraduate students

  • A community that consists of experts, instructors and peer mentors provides the ideal framework to train and supervise undergraduate students so they can make a meaningful contribution

Summary

Introduction

This guide describes the workflow for a community genome annotation project that connects undergraduate students with bioinformaticians, faculty and peer mentors to foster educational development and produce quality student-driven annotation. The manual curation of gene families promotes a deeper understanding of underlying biology, a critical learning experience for undergraduate students. We describe a workflow to establish curation resources, train undergraduate students, curate gene families, perform quality control and publish the results. The success of this community annotation model is based on a roadmap that includes building a collaborative ecosystem (see 1.1–3), recruiting new students (see 2.1) and providing them with initial training followed by continued support (see 2.3 and 3.1). The student-driven annotation community model outlined here provides significant undergraduate-based educational opportunities that will yield a well-trained student population and provide the scientific research community with quality curated data sets.

Build a collaborative ecosystem
Train annotators and formalize curation practices
Quality evaluation and publication
Structural assessment
Functional curation
1-3 Iterative annotation with review
Publication
Conclusion
