Abstract

Wheat and maize genes were hypothesized to be clustered into islands, but the hypothesis was not statistically tested. The hypothesis is statistically tested here in four grass species differing in genome size: Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Aegilops tauschii. Density functions obtained under a model where gene locations follow a homogeneous Poisson process, and thus are not clustered, are compared with a model-free situation quantified through a non-parametric density estimate. A simple homogeneous Poisson model for gene locations is not rejected for the small O. sativa and B. distachyon genomes, indicating that genes are distributed largely uniformly in those species, but is rejected for the larger S. bicolor and Ae. tauschii genomes, providing evidence for clustering of genes into islands. It is proposed to call the gene islands “gene insulae” to distinguish them from other types of gene clustering that have been proposed. An average S. bicolor and Ae. tauschii insula is estimated to contain 3.7 and 3.9 genes, with an average intergenic distance within an insula of 2.1 and 16.5 kb, respectively. Inter-insular distances are greater than 8 and 81 kb and average 15.1 and 205 kb in S. bicolor and Ae. tauschii, respectively. A greater gene density observed in the distal regions of the Ae. tauschii chromosomes is shown to be primarily caused by shortening of inter-insular distances. The comparison of the four grass genomes suggests that gene locations are largely a function of a homogeneous Poisson process in small genomes. Nonrandom insertions of LTR retroelements during genome expansion create gene insulae, which become less dense and further apart with the increase in genome size. High concordance in relative lengths of orthologous intergenic distances among the investigated genomes, including the maize genome, suggests functional constraints on gene distribution in the grass genomes.
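
The logic of testing a homogeneous Poisson model can be illustrated with a minimal sketch. Under such a model, intergenic distances are exponentially distributed, so a poor exponential fit points towards clustering. The code below is not the authors' procedure: it swaps the density comparison described above for a simpler Kolmogorov-Smirnov style distance between the empirical distribution and the fitted exponential. All function names are hypothetical, the data are simulated, and the clustered case is assumed to be a plain two-component mixture that borrows the Ae. tauschii averages (16.5 kb, 205 kb, 3.9 genes per insula) purely as illustrative parameters.

```python
import math
import random


def fit_exponential_rate(distances):
    """Maximum-likelihood exponential rate: 1 / mean distance."""
    return len(distances) / sum(distances)


def ks_distance_to_exponential(distances):
    """Largest gap between the empirical CDF and the fitted exponential CDF."""
    rate = fit_exponential_rate(distances)
    xs = sorted(distances)
    n = len(xs)
    worst = 0.0
    for i, x in enumerate(xs):
        model = 1.0 - math.exp(-rate * x)
        # Compare the model with the empirical CDF just before and at x.
        worst = max(worst, abs(model - i / n), abs((i + 1) / n - model))
    return worst


def simulate_poisson_distances(n, rng, mean_kb=60.0):
    """Distances under a homogeneous Poisson process (assumed mean, in kb)."""
    return [rng.expovariate(1.0 / mean_kb) for _ in range(n)]


def simulate_clustered_distances(n, rng, within_kb=16.5, between_kb=205.0,
                                 genes_per_cluster=3.9):
    """Assumed mixture: short within-cluster gaps, long between-cluster gaps."""
    p_between = 1.0 / genes_per_cluster  # one long gap per cluster
    distances = []
    for _ in range(n):
        mean = between_kb if rng.random() < p_between else within_kb
        distances.append(rng.expovariate(1.0 / mean))
    return distances


rng = random.Random(0)  # fixed seed so the run is repeatable
ks_uniform = ks_distance_to_exponential(simulate_poisson_distances(2000, rng))
ks_clustered = ks_distance_to_exponential(simulate_clustered_distances(2000, rng))
print(f"Poisson-like distances:  KS distance = {ks_uniform:.3f}")
print(f"Clustered distances:     KS distance = {ks_clustered:.3f}")
```

In this sketch the clustered sample should sit much further from the fitted exponential than the Poisson-like sample. A formal test would additionally need a reference distribution for the distance, for example from simulation, because the rate is estimated from the same data.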

Highlights

  • Genes in grass genomes are separated from each other by intergenic regions often containing transposable elements (TEs) [1]

  • In B. distachyon and rice, the nonparametric density functions do not significantly differ from the fitted exponential density for those species (Figures 1A and 1B and Table 1) and indicate that genes are not clustered in these species

  • We propose the new term “gene insulae” because the English term “gene islands” has been used for other forms of gene clustering

Introduction

Genes in grass genomes are separated from each other by intergenic regions often containing transposable elements (TEs) [1]. The balance between the rate at which TEs are inserted into an intergenic region and the rate at which they are deleted determines whether the region is expanding or contracting [1,2,3,4]. Changes in this balance are almost certainly the primary cause of variation in genome size and gene density along chromosomes. Gene density in many grass genomes increases overall from the proximal towards the distal regions of chromosome arms. This gradient is heterogeneous, and regions of higher and lower gene density are superimposed on it. This pattern of gene distribution has been observed in all grass genomes sequenced to date [5,6,7,8,9,10].
