Abstract
N7-methylguanosine (m7G) is an essential, ubiquitous, and positively charged modification at the 5′ cap of eukaryotic mRNA that modulates mRNA export, translation, and splicing. Although several machine learning (ML)-based computational predictors for m7G have been developed, each utilized a single, specific computational framework. This study is the first to explore four different computational frameworks and identify the best approach. Based on this comparison, we developed a novel predictor, THRONE (a three-layer ensemble predictor for identifying human RNA N7-methylguanosine sites), to accurately identify m7G sites from the human genome. THRONE employs a wide range of sequence-based features as input to several ML classifiers and combines the resulting models through ensemble learning. The three-step ensemble learning proceeds as follows. In the first layer, 54 baseline models were constructed, and their predicted probabilities of m7G were treated as a new feature vector for the next step. Subsequently, six meta-models were built on this new feature vector, and their predicted probabilities were in turn treated as novel features. Finally, a systematic evaluation on these novel features identified random forest as the best super learner for the final prediction. THRONE outperformed existing methods in predicting m7G sites in both cross-validation analysis and independent evaluation. The proposed method is publicly accessible at http://thegleelab.org/THRONE/ and is expected to help the scientific community identify putative m7G sites and formulate novel, testable biological hypotheses.
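The three-layer scheme described above can be illustrated with a minimal sketch. This is not the THRONE implementation: it assumes scikit-learn, substitutes synthetic data for the encoded sequence features, uses four baseline models and two meta-models in place of the 54 and six reported, and generates each layer's features from out-of-fold predicted probabilities, which is a common stacking practice rather than a detail stated in the abstract. The helper name `oof_probabilities` is hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.naive_bayes import GaussianNB


def oof_probabilities(models, X, y, cv=5):
    """Return one column per model: its out-of-fold positive-class probability."""
    columns = [
        cross_val_predict(model, X, y, cv=cv, method="predict_proba")[:, 1]
        for model in models
    ]
    return np.column_stack(columns)


# Synthetic stand-in for sequence-derived feature encodings (assumption).
X, y = make_classification(n_samples=300, n_features=20, random_state=0)

# Layer 1: baseline models (the abstract reports 54; four are used here).
baseline_models = [
    LogisticRegression(max_iter=1000),
    GaussianNB(),
    RandomForestClassifier(n_estimators=50, random_state=0),
    ExtraTreesClassifier(n_estimators=50, random_state=0),
]
layer1_features = oof_probabilities(baseline_models, X, y)

# Layer 2: meta-models trained on layer-1 probabilities (six in the abstract).
meta_models = [
    LogisticRegression(max_iter=1000),
    RandomForestClassifier(n_estimators=50, random_state=1),
]
layer2_features = oof_probabilities(meta_models, layer1_features, y)

# Layer 3: a random forest super learner makes the final prediction.
super_learner = RandomForestClassifier(n_estimators=100, random_state=2)
final_proba = cross_val_predict(
    super_learner, layer2_features, y, cv=5, method="predict_proba"
)[:, 1]
final_labels = (final_proba >= 0.5).astype(int)
```

Each layer sees only the probabilities produced by the layer before it, so the feature dimension shrinks from the raw encoding to one column per model. The 0.5 decision threshold is likewise an illustrative choice.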