Abstract

GC content, an important compositional feature of the genome, varies significantly among genomes and among regions within a genome. Identifying the driving forces that shaped GC content and deciphering the biological meaning of its variation will help us understand genome evolution. We analyzed and compared the GC contents of 20 selected plant species representing the major evolutionary lineages. Our results revealed the highest GC content and GC heterogeneity in the grass genomes, followed by the non-grass monocot genomes and then the dicot genomes. Detailed analysis of genic regions showed higher GC content in terminal exons than in internal exons in all selected species except Volvox carteri. A strong correlation between the GC contents of exons and their neighboring introns at the termini of genes was observed in the genomes of all the grasses, Musa acuminata, Spirodela polyrhiza and Nelumbo nucifera. Our results suggest that the widely reported negative gradient of GC3 along coding sequences from 5′ to 3′ is likely an artifact of calculating GC content on an admixture of genes with variable lengths and exon numbers. Our findings support a role for GC-biased gene conversion in shaping the nucleotide composition landscapes of monocots. The U-shaped pattern of GC content along genes may have resulted from variable degrees of interaction among the transcription, replication and DNA repair machineries. Transcription-associated recombination might play a major role in GC content evolution.
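The two quantities the abstract relies on, GC content and GC3 (GC content at third codon positions), can be illustrated with a minimal Python sketch. The function names `gc_content` and `gc3` are illustrative assumptions and are not taken from the paper. The handling of ambiguous bases (ignored) and of incomplete trailing codons (dropped) is also assumed here, and the authors' pipeline may differ.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Assumption: ambiguous symbols such as N are ignored.
    Returns NaN when the sequence has no unambiguous bases.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else float("nan")


def gc3(cds: str) -> float:
    """GC content at third codon positions of an in-frame coding sequence.

    Assumption: the sequence starts in frame; an incomplete trailing
    codon, if any, is dropped.
    """
    usable = len(cds) - len(cds) % 3   # length covered by complete codons
    third_positions = cds[2:usable:3]  # bases at indices 2, 5, 8, ...
    return gc_content(third_positions)


if __name__ == "__main__":
    print(gc_content("ATGC"))            # 0.5
    print(round(gc3("ATGGCCTAA"), 3))    # codons ATG GCC TAA -> G, C, A -> 0.667
```

Because genes differ in length and exon number, averaging such values by absolute position across a mixed gene set pools different gene regions at the same coordinate. This is the kind of admixture the abstract identifies as a likely source of the apparent 5′ to 3′ GC3 gradient.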
