Homology-based annotation yields 1,042 new candidate genes in the Drosophila melanogaster genome.

Shuba Gopal, Roberto Sanchez, Mark Schroeder, Andrej Sali, Alexander Sczyrba, Stefan Bekiranov, J Eduardo Fajardo, Gulriz Aytekin-Kurban, Terry Gaasterland, Narayanan Eswar, Ursula Pieper

doi:10.1038/85922

Abstract

The approach to annotating a genome critically affects the number and accuracy of genes identified in the genome sequence. Genome annotation based on stringent gene identification is prone to underestimate the complement of genes encoded in a genome. In contrast, over-prediction of putative genes followed by exhaustive computational sequence, motif and structural homology search will find rarely expressed, possibly unique, new genes at the risk of including non-functional genes. We developed a two-stage approach that combines the merits of stringent genome annotation with the benefits of over-prediction. First we identify plausible genes regardless of matches with EST, cDNA or protein sequences from the organism (stage 1). In the second stage, proteins predicted from the plausible genes are compared at the protein level with EST, cDNA and protein sequences, and protein structures from other organisms (stage 2). Remote but biologically meaningful protein sequence or structure homologies provide supporting evidence for genuine genes. The method, applied to the Drosophila melanogaster genome, validated 1,042 novel candidate genes after filtering 19,410 plausible genes, of which 12,124 matched the original 13,601 annotated genes. This annotation strategy is applicable to genomes of all organisms, including human.
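The two-stage filter described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the `PlausibleGene` record, the evidence labels, the use of E-values as the homology score, and the `1e-5` cutoff are all hypothetical choices made only for illustration. Stage 1 is represented by a list of over-predicted plausible genes; stage 2 keeps a gene as a novel candidate only if it does not match an existing annotation and has at least one supporting homology hit.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class PlausibleGene:
    """Hypothetical record for one stage-1 (over-predicted) gene."""
    gene_id: str
    matches_known_annotation: bool
    # Hypothetical best E-value per evidence type, e.g. "est", "cdna",
    # "protein", "structure"; an absent key means no hit was found.
    homology_evalues: Dict[str, float] = field(default_factory=dict)


def has_homology_support(gene: PlausibleGene, evalue_cutoff: float = 1e-5) -> bool:
    """Stage 2: true if any evidence type has a hit at or below the cutoff.

    The cutoff is an assumed value, not one taken from the paper.
    """
    return any(e <= evalue_cutoff for e in gene.homology_evalues.values())


def novel_candidates(genes: List[PlausibleGene], evalue_cutoff: float = 1e-5) -> List[str]:
    """Return IDs of plausible genes that are unannotated but homology-supported."""
    return [
        g.gene_id
        for g in genes
        if not g.matches_known_annotation and has_homology_support(g, evalue_cutoff)
    ]


# Toy input standing in for stage-1 output (invented data, not from the paper).
toy_genes = [
    PlausibleGene("pg1", True, {"protein": 1e-40}),     # already annotated
    PlausibleGene("pg2", False, {"structure": 1e-8}),   # novel, supported
    PlausibleGene("pg3", False, {"est": 0.5}),          # novel, weak hit only
    PlausibleGene("pg4", False),                        # novel, no evidence
]

candidates = novel_candidates(toy_genes)
```

With the toy input, only `pg2` survives both conditions. Keeping the annotation check separate from the homology check mirrors the abstract's distinction between plausible genes that match the original annotation and those validated as new.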

Journal: Nature Genetics
Publication Date: Mar 1, 2001
Citations: 76

Similar Papers

A hybrid approach for indexing and searching protein structures
WSEAS Transactions on Computers archive | VOL. 8
01 Jun 2009

Glossary
Fran Lewitter ... Janet M Thornton
Trends in Biotechnology | VOL. 16
01 Nov 1998

3DFI: a pipeline to infer protein function using structural homology.
Alexander Thomas Julian ... Jean-François Pombert
Bioinformatics advances | VOL. 1
09 Jun 2021

Heuristic Methods for Finding Pathogenic Variants in Gene Coding Sequences
Monique Ohanian ... Diane Fatkin
Journal of the American Heart Association | VOL. 1
26 Sep 2012
