A good scoring function is necessary for ab initio prediction of RNA tertiary structures. In this study, we explored the power of a machine-learning-based approach as a scoring function. Compared with traditional scoring functions, the present approach is more flexible in incorporating different kinds of features; it also avoids the difficult problem of choosing a reference state. Two multi-layer neural networks were constructed and trained. Each takes an RNA structural candidate as input and outputs a likeness score that evaluates how closely the candidate resembles the native structure. The first network works at the coarse-grained level of RNA structures, the second at the all-atom level. We also built an RNA database and split it into training, validation, and testing sets containing 322, 70, and 70 RNAs, respectively. Each RNA was accompanied by 300 decoys generated by high-temperature molecular dynamics simulations. The networks were trained on the training set and optimized with an early-stop strategy based on the loss on the validation set. We then tested the performance of the networks on the testing set. The results were consistently better than those of a recent knowledge-based all-atom potential.
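The early-stop strategy mentioned above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `early_stop_epoch`, the `patience` parameter, and the example loss values are all assumptions introduced here for illustration. The sketch only shows the general idea of keeping the model from the epoch with the lowest validation loss and halting once that loss has stopped improving.

```python
from typing import Sequence, Tuple


def early_stop_epoch(val_losses: Sequence[float], patience: int = 3) -> Tuple[int, float]:
    """Return (best_epoch, best_loss) under a simple early-stopping rule.

    `val_losses` holds one validation loss per epoch (hypothetical values).
    Training is considered stopped once `patience` consecutive epochs have
    passed without a new minimum; `patience` is an assumed hyperparameter.
    """
    best_epoch, best_loss = -1, float("inf")
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # New minimum: this is the checkpoint that would be kept.
            best_epoch, best_loss = epoch, loss
        elif epoch - best_epoch >= patience:
            # No improvement for `patience` epochs: stop training here.
            break
    return best_epoch, best_loss


# Hypothetical validation-loss curve, one value per epoch.
losses = [0.9, 0.7, 0.6, 0.65, 0.62, 0.61, 0.5]
print(early_stop_epoch(losses, patience=3))
```

With a small `patience`, training halts before the late improvement at the final epoch is seen; a larger `patience` trades extra training time for the chance to find it.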