Abstract

Pentatricopeptide repeat (PPR) proteins are the largest known RNA-binding protein family. They are found in all eukaryotes and are particularly abundant in higher plants. PPR proteins localize mostly to mitochondria and chloroplasts, and many have been shown to modulate organellar genome expression at the posttranscriptional level. Although the genomes of land plants encode hundreds of PPR proteins, only a few have been identified in Fungi and Metazoa. Because the current PPR motif profiles are built mainly from the predominant plant sequences, they are unlikely to be optimal for detecting fungal and animal members of the family, and many putative PPR proteins in these genomes may remain undetected. To test this hypothesis, we designed a hidden Markov model-based bioinformatic tool called Supervised Clustering-based Iterative Phylogenetic Hidden Markov Model algorithm for the Evaluation of tandem Repeat motif families (SCIPHER), using sequence data from orthologous clusters from available yeast genomes. This approach allowed us to assign 12 new proteins in Saccharomyces cerevisiae to the PPR family. Similarly, in other yeast species, we obtained a 5-fold increase in the detection of PPR motifs compared with previous tools. All the newly identified S. cerevisiae PPR proteins localize to the mitochondrion and are part of the RNA processing interaction network. Furthermore, the yeast PPR proteins seem to undergo accelerated divergent evolution. Analysis of single and double amino acid substitutions in the Dmr1 protein of S. cerevisiae suggests that cooperative interactions between motifs and pseudoreversion could be the force driving this rapid evolution.
