Predicting the precise locations of metal-binding sites in metalloproteins is a crucial challenge in biophysics. A fast, accurate, and interpretable computational prediction method can complement experimental studies. In this work, we develop a physics-based method, using an all-atom description of the protein, to predict the locations of Ca2+ ions in calcium-binding proteins. The method is substantially faster than molecular dynamics simulation-based methods and is as accurate as data-driven approaches. It uses the three-dimensional reference interaction site model (3D-RISM), a statistical mechanical theory, to calculate the Ca2+ ion density around a protein structure; the Ca2+ ion locations are then obtained from this density. We assess the method on datasets used in previous works so that it can be compared directly with them. Its accuracy is 88%, comparable to that of the FEATURE program, a well-known data-driven method. Moreover, because the method is physics-based, the reasons for its failures can be ascertained in most cases. We have thoroughly examined the failed cases using structural and crystallographic measures such as the B-factor, R-factor, electron density map, and binding-site geometry. In many of the failed cases, the X-ray structures themselves have issues, such as geometric irregularities and dubious assignment of ion positions. Our algorithm, together with these checks of structural accuracy, is a major step toward predicting calcium ion positions in metalloproteins.
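To illustrate the step in which ion locations are obtained from a computed density, the following is a minimal sketch of one generic approach: picking strict local maxima above a threshold on a regular 3D grid. It is not the procedure used in this work, which the abstract does not specify. The function name `find_density_peaks`, the nested-list grid layout, and the `origin`, `spacing`, and `threshold` parameters are all assumptions made for illustration.

```python
from itertools import product


def find_density_peaks(density, origin, spacing, threshold):
    """Return (x, y, z, value) tuples for strict local maxima of a 3D grid.

    density   : nested list indexed as density[i][j][k] (hypothetical layout)
    origin    : (x0, y0, z0) coordinates of grid point (0, 0, 0)
    spacing   : uniform grid spacing, in the same length unit as origin
    threshold : minimum density value for a grid point to be considered
    """
    nx, ny, nz = len(density), len(density[0]), len(density[0][0])
    # The 26 neighbours of a grid point in a 3x3x3 cube.
    offsets = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]

    peaks = []
    for i, j, k in product(range(nx), range(ny), range(nz)):
        value = density[i][j][k]
        if value < threshold:
            continue
        is_peak = True
        for di, dj, dk in offsets:
            a, b, c = i + di, j + dj, k + dk
            # Neighbours outside the grid are ignored.
            if 0 <= a < nx and 0 <= b < ny and 0 <= c < nz:
                if density[a][b][c] >= value:
                    is_peak = False
                    break
        if is_peak:
            peaks.append((
                origin[0] + i * spacing,
                origin[1] + j * spacing,
                origin[2] + k * spacing,
                value,
            ))

    # Highest-density candidates first.
    peaks.sort(key=lambda p: p[3], reverse=True)
    return peaks
```

In a sketch like this, the returned peaks would be candidate ion positions; comparing them with crystallographic Ca2+ positions would additionally require a distance criterion, which is not given in the abstract.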