Multi-models in predicting RNA solvent accessibility exhibit the contribution from none-sequential attributes and providing a globally stable modeling strategy

Yuyao Huang, Yizhou Li, Yuan Liu, Xingyong Zhu, Runyu Jing, Menglong Li

doi:10.1016/j.chemolab.2020.104100

Abstract

Research on protein solvent accessibility is well established. Research on RNA solvent accessibility, however, faces challenges such as the instability and diversity of RNA tertiary structure. To date, no study has examined how prediction performance varies across datasets built with multiple models and multiple attributes, although such a comparison is an important part of measuring overall modeling performance. We therefore performed a comprehensive comparison of RNA solvent accessibility prediction based on two datasets. A total of 15,923 (12,229 + 3,694) samples and 512 (336 + 176) attributes were generated for attribute selection, and 336 models were finally built for prediction. Twelve modeling methods and two attribute selection methods were used for modeling and evaluation. This work provides a strategy for obtaining stable expectations when predicting RNA solvent accessibility. These results should be useful for further experimental or computational design in RNA solvent accessibility prediction, and we hope this work helps related research that needs a workflow for stable prediction.

Full Text