Abstract

The landscape of plant genomes, while slowly being characterized and defined, is still composed primarily of regions of undefined function. Many eukaryotic genomes contain isochores, mosaic regions of homogeneous GC content that can change abruptly from one isochore to its neighbor. Isochores are grouped into families characterized by their GC levels. In the soybean genome, we identified 4,339 compositionally distinct domains, 331 of which were long homogeneous genome regions (LHGRs). We assigned these LHGRs to four families based on finite mixture models of GC content, then characterized each family with respect to exon length, gene content, and transposable elements. The LHGR pattern of soybeans is unique: although the majority of the genes within LHGRs are found in a single LHGR family with a narrow GC range (Family B), that family is not the one with the highest GC content, as it is in vertebrates and invertebrates. Instead, Family B has a mean GC content of 35%. GC content across all LHGRs ranges from 16% to 59%, a wider range than is typical of vertebrates. This is the first study to identify LHGRs in soybeans and to analyze the functions of the genes within them.
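
The family assignment described above relies on a finite mixture model of GC content. The abstract does not state which component distribution or software was used, so the following is only a minimal sketch under the assumption of Gaussian components fitted by expectation-maximization (EM). The function name `fit_gaussian_mixture` and its parameters are hypothetical, and the input is assumed to be one GC percentage per region.

```python
import math


def fit_gaussian_mixture(values, k, iterations=100):
    """Fit a 1-D Gaussian mixture with k components by EM (illustrative only).

    Returns a list of (weight, mean, variance) tuples sorted by mean.
    """
    data = sorted(values)
    n = len(data)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [data[int((j + 0.5) * n / k)] for j in range(k)]
    overall_mean = sum(data) / n
    var0 = max(sum((x - overall_mean) ** 2 for x in data) / n, 1e-6)
    variances = [var0] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: responsibility of each component for each value.
        resp = []
        for x in data:
            dens = [
                w * math.exp(-((x - m) ** 2) / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                1e-6,  # floor keeps a component from collapsing to zero width
            )

    return sorted(zip(weights, means, variances), key=lambda t: t[1])
```

In a sketch like this, each region would be assigned to the component with the highest responsibility, and the number of components (four in the study) would normally be chosen with a model-selection criterion rather than fixed in advance.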

Highlights

  • The genomes of living organisms are often organized into unique patterns, the purposes of which are mostly unknown

  • We used a segmentation algorithm based on z curves to identify soybean long homogeneous genome regions (LHGRs)

  • When the slope of the z curve is positive, it indicates decreasing GC content
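
The relationship between curve slope and GC content in the highlights above can be illustrated with a small sketch. It assumes the common detrended form of the GC-disparity z curve, in which the cumulative count of A/T minus G/C bases has a least-squares line through the origin subtracted from it. The source does not give the exact formulation used in the study, so the function name `z_prime_curve` and the details of the fit are assumptions.

```python
def z_prime_curve(seq):
    """Return the detrended GC-disparity curve for a DNA sequence (illustrative)."""
    z = []
    running = 0
    for base in seq.upper():
        if base in "AT":
            running += 1  # A/T pushes the curve up
        elif base in "GC":
            running -= 1  # G/C pushes the curve down
        # Other symbols (for example N) leave the running sum unchanged.
        z.append(running)

    n = len(z)
    if n == 0:
        return []

    # Least-squares slope k of a line through the origin: z_n is roughly k * n.
    num = sum((i + 1) * z[i] for i in range(n))
    den = sum((i + 1) ** 2 for i in range(n))
    k = num / den

    # Subtracting the trend makes regions that are AT-richer than the sequence
    # average rise and regions that are GC-richer fall.
    return [z[i] - k * (i + 1) for i in range(n)]


# Toy sequence: an AT-rich stretch followed by a GC-rich stretch.
curve = z_prime_curve("AT" * 10 + "GC" * 10)
```

On this toy input the curve climbs across the AT-rich half (low GC content) and descends across the GC-rich half, and the turning point between the two is the kind of boundary a segmentation algorithm would look for.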

Introduction

The genomes of living organisms are often organized into unique patterns, the purposes of which are mostly unknown. DNA fractionation by ultracentrifugation, cytogenetic analyses, and, more recently, analyses of genes and genome sequences have been used to identify these regions. Gene density, replication timing, CpG distribution, genic size, and transcript abundance are among the main features found to associate with isochores (Bernardi, 2004). In vertebrates, these regions have been mapped and named long homogeneous genome regions (LHGRs; Zhang et al., 2010). LHGRs have also been identified in invertebrates (Cammarano et al., 2009). LHGR GC content is strongly conserved among invertebrate species
