Abstract

Analysis of epitranscriptomic RNA modifications by deep sequencing-based approaches contributes substantially to our knowledge of their precise locations and relative stoichiometry in cellular RNAs. Several analytical approaches have been proposed to reveal RNA modifications, including antibody-driven enrichment, analysis of RT-signatures and specific chemical treatments. However, the analysis and interpretation of these massive datasets, especially for low-abundance cellular RNAs (e.g. mRNA and lncRNA), is neither easy nor straightforward, because insufficient specificity and selectivity lead to numerous false-positive and false-negative identifications. The main issue in applying these methods lies in the subjective classification of potentially modified positions, which is mostly based on arbitrarily defined threshold values for different scores. Such an approach, relying on predefined score values, has proved appropriate for datasets of limited complexity (tRNA and/or rRNA analysis), but application to longer reference sequences requires much better classification algorithms. In this work we applied a machine learning algorithm (Random Forest, RF) to create a predictive model for the analysis of 2'-O-methylated sites in RNA using RiboMethSeq datasets. The model was trained on a large collection of human rRNA datasets with well-known modification profiles, and prediction performance was assessed using experimentally defined profiles for other eukaryotic rRNAs (S. cerevisiae and A. thaliana). Application of this Random Forest prediction model to the detection of other RNA modifications and to more complex datasets is discussed.
