Geraniaceae plastid genomes (plastomes) have experienced a remarkable number of genomic changes. The plastomes of Erodium texanum, Geranium palmatum, and Monsonia speciosa were sequenced and compared with other rosids and the previously published Pelargonium hortorum plastome. Geraniaceae plastomes were found to be highly variable in size, gene content and order, repetitive DNA, and codon usage. Several unique plastome rearrangements include the disruption of two highly conserved operons (S10 and rps2-atpA), and the inverted repeat (IR) region in M. speciosa does not contain all genes in the ribosomal RNA operon. The sequence of M. speciosa is unusually small (128,787 bp); among angiosperm plastomes sequenced to date, only those of nonphotosynthetic species and those that have lost one IR copy are smaller. In contrast, the plastome of P. hortorum is the largest, at 217,942 bp. These genomes have experienced numerous gene and intron losses and partial and complete gene duplications. Some of the losses are shared throughout the family (e.g., trnT-GGU and the introns of rps16 and rpl16); however, other losses are homoplasious (e.g., trnG-UCC intron in G. palmatum and M. speciosa). IR length is also highly variable. The IR in P. hortorum was previously shown to be greatly expanded to 76 kb, and the IR is lost in E. texanum and reduced in G. palmatum (11 kb) and M. speciosa (7 kb). Geraniaceae plastomes contain a high frequency of large repeats (>100 bp) relative to other rosids. Within each plastome, repeats are often located at rearrangement end points and many repeats shared among the four Geraniaceae flank rearrangement end points. GC content is elevated in the genomes and also in coding regions relative to other rosids. Codon usage per amino acid and GC content at third position sites are significantly different for Geraniaceae protein-coding sequences relative to other rosids. 
Our findings suggest that relaxed selection and/or mutational biases led to increased GC content, which in turn altered codon usage. We propose that increases in genomic rearrangements, repetitive DNA, nucleotide substitutions, and GC content may be caused by relaxed selection resulting from improper DNA repair.
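As a rough illustration of the quantities compared above (overall GC content, GC content at third codon positions, and codon counts), the following is a minimal sketch and not the authors' pipeline. The function names and the toy sequence are hypothetical, and the code assumes an in-frame coding sequence given as a plain DNA string.

```python
from collections import Counter


def gc_content(seq: str) -> float:
    """Fraction of G and C bases in a sequence (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in "GC" for base in seq) / len(seq)


def gc3_content(cds: str) -> float:
    """GC fraction at third codon positions of an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    return gc_content(cds[2:usable:3])


def codon_counts(cds: str) -> Counter:
    """Count each codon in an in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


# Hypothetical toy sequence, for illustration only (not from any plastome).
toy_cds = "ATGGCCGCGTAA"
toy_gc = gc_content(toy_cds)
toy_gc3 = gc3_content(toy_cds)
toy_codons = codon_counts(toy_cds)
```

Comparing such per-gene values between Geraniaceae and other rosids would still require an appropriate statistical test, which this sketch does not attempt.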