Abstract

Accurately identifying metal ion-binding sites from protein sequences alone is valuable for both basic experimental biology and drug discovery. Although considerable progress has been made, predicting metal ion-binding sites remains challenging because metal ions are small and highly versatile. In this paper, we develop a ligand-specific predictor, MIonSite, for predicting metal ion-binding sites from protein sequences. MIonSite first combines protein evolutionary information, predicted secondary structure, predicted solvent accessibility, and conservation information calculated by the Jensen-Shannon divergence score to extract discriminative features for each residue. An enhanced AdaBoost algorithm is then designed to cope with the severe class imbalance inherent in metal ion-binding site prediction, where non-binding sites far outnumber metal ion-binding sites. A new gold-standard benchmark dataset, consisting of training and independent validation subsets for Zn2+, Ca2+, Mg2+, Mn2+, Fe3+, Cu2+, Fe2+, Co2+, Na+, K+, Cd2+, and Ni2+, is constructed to compare MIonSite with existing predictors. Experimental results demonstrate that MIonSite achieves high prediction performance and outperforms other state-of-the-art sequence-based predictors. The standalone program of MIonSite and the corresponding datasets can be freely downloaded at https://github.com/LiangQiaoGu/MIonSite.git for academic use.
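
To illustrate the conservation feature mentioned above, the Jensen-Shannon divergence of one alignment column against a background distribution can be sketched as follows. This is a minimal sketch and not the MIonSite implementation: the uniform background, the pseudocount value, and all function names are assumptions made for illustration, and published conservation scores of this kind commonly use an empirical background and sequence weighting instead.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_distribution(column, pseudocount=1e-6):
    """Amino acid frequencies for one alignment column.

    Gaps and unknown symbols are ignored. The pseudocount (an assumed
    value) keeps every frequency strictly positive.
    """
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    return [(counts[a] + pseudocount) / total for a in AMINO_ACIDS]


def kl_divergence(p, q):
    """Kullback-Leibler divergence KL(p || q) in bits."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: mean KL of p and q to their midpoint."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)


def conservation_score(column, background=None):
    """Score a column; higher means further from the background.

    Assumption: a uniform background is used when none is supplied.
    """
    if background is None:
        background = [1 / len(AMINO_ACIDS)] * len(AMINO_ACIDS)
    return js_divergence(column_distribution(column), background)


if __name__ == "__main__":
    # A fully conserved column versus a highly variable one.
    print(conservation_score("CCCCCCCC"))
    print(conservation_score("ACDEFGHI"))
```

With base-2 logarithms the score lies between 0 and 1, so a fully conserved column (for example, a cysteine that coordinates Zn2+) scores higher than a variable one, which is what makes the value usable as a per-residue feature.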
