Abstract
In the post-genome era, insufficient functional annotation of predicted genes greatly restricts the potential of mining genome data. We demonstrate that an evolutionary approach, which is independent of functional annotation, has great potential as a tool for genome analysis. We chose the genome of the model filamentous fungus Neurospora crassa as an example. The phylogenetic distribution of each predicted protein coding gene (PCG) in the N. crassa genome was used to classify genes into six mutually exclusive lineage specificity (LS) groups, i.e. Eukaryote/Prokaryote-core, Dikarya-core, Ascomycota-core, Pezizomycotina-specific, N. crassa-orphans and Others. Functional category analysis revealed that only ∼23% of PCGs in the two most highly lineage-specific groups, Pezizomycotina-specific and N. crassa-orphans, have functional annotation. In contrast, ∼76% of PCGs in the remaining four LS groups have functional annotation. Analysis of the chromosomal localization of N. crassa-orphan PCGs and of genes encoding secreted proteins showed enrichment in subtelomeric regions. The origin of N. crassa-orphans is not known. We found that 11% of N. crassa-orphans have paralogous N. crassa-orphan genes. Of the paralogous N. crassa-orphan gene pairs, 33% were tandemly located in the genome, implying that some N. crassa-orphan PCGs originated by duplication in the past. LS grouping is thus a useful tool to explore and understand genome organization, evolution and gene function in fungi.
Highlights
A windfall of fungal genome sequences has been made available in the past ten years; at present, the genome sequences of ∼40 filamentous fungal species are available
Homologs of 145 N. crassa protein coding genes (PCGs) were found in species in the Saccharomycotina (e.g. Saccharomyces cerevisiae) and/or Taphrinomycotina (e.g. Schizosaccharomyces pombe), but homologs were not identified in non-Ascomycota fungi
For 2,219 of the 9,127 PCGs predicted in the N. crassa genome, homologous genes were not identified in any other genome; these were defined as N. crassa-orphans
Summary
A windfall of fungal genome sequences has been made available in the past ten years; at present, the genome sequences of ∼40 filamentous fungal species are available. A large proportion of predicted genes in filamentous fungal genomes are annotated as unclassified genes (lacking functional annotation). The first genome sequenced from a filamentous fungus was that of the ascomycete species Neurospora crassa [1]; 56% of its predicted protein coding genes (PCGs) lack functional annotation according to the MIPS Neurospora crassa DataBase (http://mips.gsf.de/genre/proj/ncrassa/) [2]. This problem prompted us to employ a bioinformatic tool for analysis of the N. crassa genome that does not rely on conventional approaches to functional annotation. We determined the lineage specificity (LS) of each N. crassa gene and classified the genes into six mutually exclusive LS groups using the SIMAP (similarity matrix of proteins) database [4,5]: (1) Eukaryote/Prokaryote-core (genes with homologs in non-fungal eukaryotes and/or prokaryotes), (2) Dikarya-core (genes with homologs in Basidiomycota and Ascomycota species), (3) Ascomycota-core, (4) Pezizomycotina-specific, (5) N. crassa-orphan genes and (6) Others (gene homologs identified in prokaryotes or non-fungal eukaryotes in addition to Pezizomycotina, but not in members of the Basidiomycota, Saccharomycotina or Taphrinomycotina).
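The six-way grouping described above can be illustrated with a minimal sketch. This is not the authors' pipeline: the taxon labels, the function name `classify_ls` and the order in which the rules are checked are assumptions made for illustration, and the input is assumed to be the set of taxonomic groups (other than N. crassa itself) in which a homolog of a given gene was found. In particular, the sketch assumes that a gene with non-fungal homologs counts as Eukaryote/Prokaryote-core only if it also has homologs in Basidiomycota, Saccharomycotina or Taphrinomycotina, and otherwise falls into Others.

```python
from collections import Counter

# Hypothetical taxon labels, chosen for this sketch (not SIMAP identifiers).
NON_FUNGAL = "non_fungal"            # prokaryotes or non-fungal eukaryotes
BASIDIOMYCOTA = "basidiomycota"
SACCHAROMYCOTINA = "saccharomycotina"
TAPHRINOMYCOTINA = "taphrinomycotina"
PEZIZOMYCOTINA = "pezizomycotina"    # Pezizomycotina other than N. crassa

KNOWN_TAXA = {NON_FUNGAL, BASIDIOMYCOTA, SACCHAROMYCOTINA,
              TAPHRINOMYCOTINA, PEZIZOMYCOTINA}


def classify_ls(hits):
    """Assign one LS group from the set of taxa in which homologs were found.

    The rule order below is an assumption that makes the groups mutually
    exclusive; the source text does not spell out the exact decision order.
    """
    hits = set(hits)
    unknown = hits - KNOWN_TAXA
    if unknown:
        raise ValueError(f"unknown taxon labels: {sorted(unknown)}")

    other_ascomycota = {SACCHAROMYCOTINA, TAPHRINOMYCOTINA}

    if not hits:
        # No homolog in any other genome.
        return "N. crassa-orphan"
    if NON_FUNGAL in hits:
        # Assumed split between group (1) and group (6).
        if hits & (other_ascomycota | {BASIDIOMYCOTA}):
            return "Eukaryote/Prokaryote-core"
        return "Others"
    if BASIDIOMYCOTA in hits:
        return "Dikarya-core"
    if hits & other_ascomycota:
        return "Ascomycota-core"
    # Only Pezizomycotina hits remain at this point.
    return "Pezizomycotina-specific"


# Toy input: made-up gene names mapped to the taxa with homolog hits.
toy_hits = {
    "geneA": set(),
    "geneB": {PEZIZOMYCOTINA},
    "geneC": {PEZIZOMYCOTINA, SACCHAROMYCOTINA},
    "geneD": {PEZIZOMYCOTINA, BASIDIOMYCOTA},
    "geneE": {PEZIZOMYCOTINA, NON_FUNGAL},
    "geneF": {PEZIZOMYCOTINA, BASIDIOMYCOTA, NON_FUNGAL},
}
groups = {gene: classify_ls(hits) for gene, hits in toy_hits.items()}
group_sizes = Counter(groups.values())
```

Because each gene passes through a single chain of checks and returns at the first match, every gene receives exactly one label, which is what makes the groups mutually exclusive in this sketch.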