Abstract

This study focused on the identification and phylogenetic analysis of glycine-rich RNA-binding proteins that contain an RNA recognition motif (RRM)-type RNA-binding domain in addition to a region with contiguous glycine residues in representative plant species. In higher plants, glycine-rich proteins with an RRM have attracted considerable interest because they respond to environmental cues and play a role in cold tolerance, pathogen defense, flowering time control, and circadian timekeeping. To identify such RRM-containing proteins in plant genomes, we developed an RRM profile based on the known glycine-rich RRM-containing proteins in the reference plant Arabidopsis thaliana. Applying this remodeled RRM profile, which omits sequences from non-plant species, reduced the noise when searching plant genomes for RRM proteins compared with a search using the existing RRM_1 profile. Furthermore, we developed an island scoring function that uses a sliding-window approach to identify regions with contiguous glycine residues. This approach tags regions in a protein sequence with a high content of the same amino acid, with repetitive structures scoring higher. Defining repetitive structures within a fixed sequence length offers a new way to characterize patterns that cannot easily be described by regular expressions. By combining the profile-based domain search for well-conserved regions (the RRM) with a scoring technique for regions with repetitive residues, we identified groups of proteins related to the A. thaliana glycine-rich RNA-binding proteins in eight plant species.
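
The sliding-window island scoring idea can be illustrated with a minimal Python sketch. The abstract does not give the actual scoring formula, so everything here is an assumption made for illustration: the function names (`window_score`, `island_scores`, `find_islands`), the window width, the threshold, and in particular the choice to score a window as the sum of squared run lengths of the target residue. That choice is one simple way to make contiguous stretches score higher than the same number of scattered residues; it is not necessarily the function used in the study.

```python
from itertools import groupby


def window_score(window: str, residue: str = "G") -> int:
    """Score one window as the sum of squared run lengths of `residue`.

    Assumed scoring rule (not taken from the paper): squaring the run
    length rewards contiguous stretches over scattered residues.
    """
    return sum(
        len(list(run)) ** 2
        for aa, run in groupby(window)
        if aa == residue
    )


def island_scores(seq: str, residue: str = "G", width: int = 8) -> list:
    """Slide a fixed-width window over `seq` and score each start position."""
    if width <= 0:
        raise ValueError("width must be positive")
    n_windows = max(len(seq) - width + 1, 0)
    return [window_score(seq[i:i + width], residue) for i in range(n_windows)]


def find_islands(seq: str, residue: str = "G", width: int = 8,
                 threshold: int = 16) -> list:
    """Return half-open (start, end) intervals covered by windows whose
    score reaches `threshold`; overlapping or adjacent windows are merged."""
    islands = []
    for start, score in enumerate(island_scores(seq, residue, width)):
        if score < threshold:
            continue
        end = start + width
        if islands and start <= islands[-1][1]:
            # Extend the previous island instead of opening a new one.
            islands[-1] = (islands[-1][0], end)
        else:
            islands.append((start, end))
    return islands
```

Under this assumed rule, a window such as `GGGG` scores 16 while `GAGAGAG`, which also contains four glycines, scores only 4. This is the kind of distinction between contiguous and scattered residues that is awkward to express as a single regular expression.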
