Abstract

A computational analysis of RNA editing sites was performed on protein-coding sequences of plant mitochondrial genomes from Arabidopsis thaliana, Beta vulgaris, Brassica napus, and Oryza sativa. The distribution of nucleotides around edited and unedited cytidines was compared in 41-nucleotide segments; the comparison covered 1481 edited cytidines and 21,390 unedited cytidines in the 4 genomes. The distribution of nucleotides was examined in 1-, 2-, and 3-nucleotide windows by comparison of nucleotide frequency ratios and relative entropy. The relative entropy analyses indicate that information is encoded in the nucleotide sequences in the 5' flank (-18 to -14, -13 to -10, -6 to -4, -2/-1) and in the immediate 3' flanking nucleotide (+1), and these regions may be important in editing site recognition. The relative entropy was large when 2- or 3-nucleotide windows were analyzed, suggesting that several contiguous nucleotides may be involved in editing site recognition. RNA editing sites were frequently preceded by 2 pyrimidines or AU and followed by a guanosine (HYCG) in the monocot and dicot mitochondrial genomes, and were rarely preceded by 2 purines. Analysis of chloroplast editing sites from a dicot, Nicotiana tabacum, and a monocot, Zea mays, revealed a similar distribution of nucleotides around editing sites (HYCA). The similarity of this motif around editing sites in monocots and dicots, in both mitochondria and chloroplasts, suggests that the motif has a mechanistic basis common to these different organelle and phylogenetic systems. The preferred sequence distribution around RNA editing sites may have an important impact on the acquisition of editing sites in evolution, because the immediate sequence context of a cytidine residue may render it editable or uneditable and consequently determine whether a T to C mutation at a specific position can be corrected by RNA editing.
The distribution of editing sites in many protein-coding sequences is shown to be non-random, with editing sites clustered in groups separated by regions with no editing sites. The sporadic distribution of editing sites could result from a mechanism of editing site loss by gene conversion utilizing edited sequence information, possibly through an edited cDNA intermediate.
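
The relative entropy comparison described in the abstract can be illustrated with a minimal sketch. It assumes that each segment is an RNA string aligned so that the candidate cytidine sits at a fixed index, that relative entropy is computed as the Kullback-Leibler divergence (in bits) of the edited window frequencies from the unedited ones, and that a pseudocount is added to avoid zero frequencies. The function names, the pseudocount, and the toy sequences are illustrative assumptions, not the authors' implementation or data.

```python
import math
from collections import Counter
from itertools import product


def kmer_frequencies(segments, start, width, alphabet="ACGU", pseudocount=1.0):
    """Smoothed frequency of every width-mer found at index `start`.

    Windows containing symbols outside `alphabet` (or truncated windows)
    are ignored. The pseudocount is an assumption of this sketch.
    """
    counts = Counter(seg[start:start + width] for seg in segments)
    kmers = ["".join(p) for p in product(alphabet, repeat=width)]
    total = sum(counts[k] for k in kmers) + pseudocount * len(kmers)
    return {k: (counts[k] + pseudocount) / total for k in kmers}


def relative_entropy(edited, unedited, start, width=1):
    """Kullback-Leibler divergence (bits) of edited from unedited windows."""
    p = kmer_frequencies(edited, start, width)
    q = kmer_frequencies(unedited, start, width)
    return sum(p[k] * math.log2(p[k] / q[k]) for k in p)


def offset_to_index(offset, center):
    """Convert a position relative to the cytidine (e.g. -1, +1) to an index."""
    return center + offset


# Toy 5-nucleotide segments (hypothetical, not real data); cytidine at index 2.
# A 41-nucleotide segment as in the abstract would use center = 20.
CENTER = 2
edited = ["UCCGA", "CUCGA", "UUCGU"]
unedited = ["GACAA", "AGCUG", "GGCAC"]

# Single-nucleotide window at +1, and a 2-nucleotide window covering -2/-1.
h_plus1 = relative_entropy(edited, unedited, offset_to_index(+1, CENTER), width=1)
h_minus2 = relative_entropy(edited, unedited, offset_to_index(-2, CENTER), width=2)
print(f"+1: {h_plus1:.3f} bits, -2/-1: {h_minus2:.3f} bits")
```

Scanning `start` across all positions of the segment, for widths 1 to 3, would give a per-position profile of the kind the abstract summarizes; positions where the value is close to zero carry little information that distinguishes edited from unedited cytidines.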
