Abstract

In the early stage of the C. elegans sequencing project, the ab initio gene prediction program Genefinder was used to find protein-coding genes. Subsequently, protein-coding gene structures have been actively curated by WormBase using evidence from all available data sources. Most coding loci were identified by the Genefinder program, but the process of gene curation results in a continual refinement of the details of gene structure, involving the correction and confirmation of intron splice sites, the addition of alternate splicing forms, the merging and splitting of incorrect predictions, and the creation and extension of 5' and 3' ends. The development of new technologies results in the availability of further data sources, and these are incorporated into the evidence used to support the curated structures. Non-coding genes are more difficult to curate using this methodology, and so the structures for most of these have been imported from the literature or from specialist databases of ncRNA data. This article describes the structure and curation of transcribed regions of genes.
