RNAPosers: Machine Learning Classifiers for Ribonucleic Acid-Ligand Poses.

Sahil Chhabra, Jingru Xie, Aaron T. Frank

doi:10.1021/acs.jpcb.0c02322

Abstract

Determining the three-dimensional (3D) structures of ribonucleic acid (RNA)-small molecule ligand complexes is critical to understanding molecular recognition in RNA. Computer docking can, in principle, be used to predict the 3D structure of RNA-small molecule complexes. Unfortunately, retrospective analysis has shown that the scoring functions that are typically used for pose prediction tend to misclassify non-native poses as native and vice versa. Here, we use machine learning to train a set of pose classifiers that estimate the relative "nativeness" of a set of RNA-ligand poses. At the heart of our approach is the use of a pose "fingerprint" (FP) that is a composite of a set of atomic FPs, which individually encode the local "RNA environment" around ligand atoms. We found that by ranking poses based on classification scores from our machine learning classifiers, we were able to recover native-like poses better than when we ranked poses based on their docking scores. With a leave-one-out training and testing approach, we found that one of our classifiers could recover poses that were within 2.5 Å of the native poses in ∼80% of the 80 cases we examined, and, on two separate validation sets, we could recover such poses in ∼60% of the cases. Our set of classifiers, which we refer to as RNAPosers, should find utility as a tool to aid in RNA-ligand pose prediction, and so we make RNAPosers open to the academic community via https://github.com/atfrank/RNAPosers.
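To make the workflow in the abstract concrete, the following is a minimal Python sketch of its two ideas: building a pose fingerprint as a composite of per-atom fingerprints, and ranking poses by classification score to check whether the top-ranked pose lies within 2.5 Å of the native pose. It is an illustration under stated assumptions, not the RNAPosers implementation. The shell-count featurization, the shell radii, and every function name here (`atomic_fingerprint`, `pose_fingerprint`, `rank_poses`, `top_pose_is_native_like`) are hypothetical, and the scores and RMSD values are made-up toy inputs standing in for classifier output.

```python
import math
from collections import Counter

# Hypothetical featurization: the shell radii (in angstroms) are an assumption
# made for illustration, not values taken from the paper.
SHELLS = (4.0, 8.0)


def atomic_fingerprint(ligand_xyz, rna_atoms, shells=SHELLS):
    """Count RNA atoms, by element and distance shell, around one ligand atom."""
    counts = Counter()
    for element, xyz in rna_atoms:
        distance = math.dist(ligand_xyz, xyz)
        for shell_index, outer_radius in enumerate(shells):
            if distance <= outer_radius:
                # Assign the RNA atom to the innermost shell that contains it.
                counts[(element, shell_index)] += 1
                break
    return counts


def pose_fingerprint(ligand_coords, rna_atoms, shells=SHELLS):
    """Composite fingerprint: the sum of the atomic fingerprints of a pose."""
    total = Counter()
    for xyz in ligand_coords:
        total.update(atomic_fingerprint(xyz, rna_atoms, shells))
    return total


def rank_poses(scores):
    """Return pose indices ordered from highest to lowest classification score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


def top_pose_is_native_like(scores, rmsds, threshold=2.5):
    """True if the best-scored pose is within `threshold` angstroms of native."""
    best = rank_poses(scores)[0]
    return rmsds[best] <= threshold


# Toy inputs (fabricated for illustration only).
rna_atoms = [("N", (0.0, 0.0, 0.0)), ("O", (3.0, 0.0, 0.0)), ("P", (10.0, 0.0, 0.0))]
ligand_coords = [(1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
fingerprint = pose_fingerprint(ligand_coords, rna_atoms)

scores = [0.2, 0.9, 0.5]   # stand-in classification scores for three poses
rmsds = [6.1, 1.8, 3.0]    # stand-in RMSDs to the native pose, in angstroms
print(dict(fingerprint))
print(rank_poses(scores), top_pose_is_native_like(scores, rmsds))
```

In this sketch a case counts as a success when the single best-scored pose falls under the RMSD threshold; averaging that outcome over a set of RNA-ligand systems would give a recovery rate of the kind reported above.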

Full Text