Abstract

Using the rice (Oryza sativa) sp. japonica genome annotation, along with genomic sequence and clustered transcript assemblies from 184 species in the plant kingdom, we have identified a set of 861 rice genes that are evolutionarily conserved among six diverse species within the Poaceae yet lack significant sequence similarity with plant species outside the Poaceae. This set of evolutionarily conserved and lineage-specific rice genes is termed conserved Poaceae-specific genes (CPSGs) to reflect the presence of significant sequence similarity across three separate Poaceae subfamilies. The vast majority of rice CPSGs (86.6%) encode proteins with no putative function or functionally characterized protein domain. Of the remaining CPSGs, 8.8% encode an F-box domain-containing protein and 4.5% encode a protein with a putative function. On average, the CPSGs have fewer exons, shorter total gene length, and elevated GC content when compared with genes annotated as transposable elements (TEs) or with genes having significant sequence similarity in a species outside the Poaceae. Multiple sequence alignments of the CPSGs with sequences from other Poaceae species show conservation across a putative domain, a novel domain, or the entire coding length of the protein. At the genome level, syntenic alignments between sorghum (Sorghum bicolor) and 103 of the 861 rice CPSGs (12.0%) could be made, demonstrating an additional level of conservation for this set of genes within the Poaceae. The extensive sequence similarity among evolutionarily distinct species within the Poaceae family, together with an additional screen for TE-related structural characteristics and sequence, discounts the possibility that these CPSGs are misannotated TEs. Collectively, these data confirm that we have identified a specific set of genes that are highly conserved within, as well as specific to, the Poaceae.

Highlights

  • Pairwise sequence comparisons were performed between the 42,653 non-TE rice genes of The Institute for Genomic Research (TIGR) Version 4 annotation and both plant genomic sequences and TIGR Version 1 plant transcript assemblies (TAs), using TBLASTN (Altschul et al, 1997; Childs et al, 2007; Ouyang et al, 2007; http://plantta.tigr.org)

  • TBLASTN was performed with the rice non-transposable element (non-TE) genes separately against five sets of genomic sequence: (1) the finished genomic sequence for Arabidopsis (Arabidopsis Genome Initiative, 2000); (2) bacterial artificial chromosome (BAC)-based sequence assemblies and annotation for a model species in the Fabaceae, Medicago (Medicago truncatula; Cannon et al, 2005; Town, 2006); (3) hi-Cot and methylation filtration genomic assemblies for maize (Zea mays; AZMs; Palmer et al, 2003; Whitelaw et al, 2003; Yuan et al, 2003; Chan et al, 2006); (4) methylation filtration genomic assemblies for sorghum (Sorghum bicolor; ASBs; Bedell et al, 2005; ftp://ftp.tigr.org/pub/data/MAIZE/Sorghum_assembly/ASB.gz); and (5) whole-genome shotgun assemblies for a model species in the Salicaceae, poplar (Populus trichocarpa; Tuskan et al, 2006)
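
The selection logic implied by these comparisons (keep rice genes with significant matches inside the Poaceae and none outside it) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' pipeline: the function name `classify_genes`, the E-value cutoff, the minimum number of Poaceae species, and the pre-parsed `(gene_id, species, e_value)` input are all hypothetical, since the excerpt does not give the thresholds or file formats actually used.

```python
from collections import defaultdict

# Hypothetical values; the excerpt does not state the thresholds used.
E_VALUE_CUTOFF = 1e-5
MIN_POACEAE_SPECIES = 2


def classify_genes(hits, poaceae_species,
                   e_value_cutoff=E_VALUE_CUTOFF,
                   min_poaceae_species=MIN_POACEAE_SPECIES):
    """Return query genes with significant hits in at least
    `min_poaceae_species` Poaceae species and no significant hit
    in any species outside the Poaceae.

    `hits` is an iterable of (gene_id, species, e_value) tuples,
    for example pre-parsed from tabular TBLASTN output.
    """
    inside = defaultdict(set)  # gene -> Poaceae species with a significant hit
    outside = set()            # genes with a significant hit outside the Poaceae
    for gene_id, species, e_value in hits:
        if e_value > e_value_cutoff:
            continue  # not significant under the assumed cutoff
        if species in poaceae_species:
            inside[gene_id].add(species)
        else:
            outside.add(gene_id)
    return {
        gene for gene, species_hit in inside.items()
        if len(species_hit) >= min_poaceae_species and gene not in outside
    }


if __name__ == "__main__":
    # Toy data with made-up gene identifiers and E-values.
    poaceae = {"Zea mays", "Sorghum bicolor"}
    toy_hits = [
        ("gene_A", "Zea mays", 1e-30),
        ("gene_A", "Sorghum bicolor", 1e-20),
        ("gene_B", "Zea mays", 1e-40),
        ("gene_B", "Arabidopsis thaliana", 1e-15),
        ("gene_C", "Sorghum bicolor", 0.5),
    ]
    print(sorted(classify_genes(toy_hits, poaceae)))
```

In the toy data, only gene_A passes: gene_B has a significant match outside the Poaceae, and gene_C has no match below the assumed cutoff. Keeping the cutoff and species count as parameters makes it easy to see how sensitive such a gene set is to those choices.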

Summary

Introduction

Using previous comparative analyses as a guide, our analysis incorporated the finished rice genome sequence and its annotation, in combination with the genomic sequence and EST resources available for 184 evolutionarily diverse species in the plant kingdom, to define and characterize a set of genes conserved within, as well as specific to, the Poaceae. This set of 861 rice genes has been termed conserved Poaceae-specific genes (CPSGs). Their broad evolutionary conservation across the Poaceae indicates that the CPSGs are not artifacts of annotation or unclassified transposable elements (TEs; Bennetzen et al, 2004; Kellogg and Bennetzen, 2004); rather, they represent a bona fide set of lineage-specific genes that largely lack any known function.
