Abstract

Protein-based therapeutics have created a demand for protein engineering across biologically relevant pH environments. In silico engineering workflows typically rely on high-throughput screening campaigns, which require fast yet accurate algorithms to evaluate large sets of protein residues and point mutations. Although several high-throughput pKa prediction methods exist, their accuracy remains unclear because no current, comprehensive benchmark is available. Here, seven fast, efficient, and accessible approaches (PROPKA3, DeepKa, PKAI, PKAI+, DelPhiPKa, MCCE2, and H++) were systematically tested on a nonredundant subset of 408 measured protein residue pKa shifts from the pKa database (PKAD). Statistical bootstrapping showed that no method outperformed the null hypotheses with confidence; nevertheless, DeepKa, PKAI+, PROPKA3, and H++ had utility. In particular, DeepKa performed consistently well in tests across multiple and individual amino acid residue types, as reflected by lower errors, higher correlations, and improved classifications. Arithmetic averaging of the best empirical predictors into simple consensuses improved overall transferability and accuracy, reaching a root-mean-square error of 0.76 pKa units and a correlation coefficient (R²) of 0.45 with respect to experimental pKa shifts. This analysis should provide a basis for further methodological developments and guide future applications that require embedding computationally inexpensive pKa prediction methods, such as the optimization of antibodies for pH-dependent antigen binding.
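
The consensus and bootstrapping ideas mentioned above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the zero-shift null model, the percentile confidence interval, and the toy numbers are all assumptions made for illustration, and none of the values come from PKAD.

```python
import math
import random
from statistics import mean


def rmse(pred, exp):
    """Root-mean-square error between predicted and experimental pKa shifts."""
    return math.sqrt(mean((p - e) ** 2 for p, e in zip(pred, exp)))


def consensus(*predictions):
    """Arithmetic mean, residue by residue, across several predictors."""
    return [mean(values) for values in zip(*predictions)]


def bootstrap_rmse_diff(pred, null, exp, n_resamples=1000, seed=0):
    """Percentile 95% interval for RMSE(pred) - RMSE(null).

    Residues are resampled with replacement. An interval that lies entirely
    below zero would suggest the predictor beats the null model.
    """
    rng = random.Random(seed)
    n = len(exp)
    diffs = []
    for _ in range(n_resamples):
        idx = [rng.randrange(n) for _ in range(n)]
        e = [exp[i] for i in idx]
        diffs.append(rmse([pred[i] for i in idx], e)
                     - rmse([null[i] for i in idx], e))
    diffs.sort()
    lower = diffs[int(0.025 * n_resamples)]
    upper = diffs[int(0.975 * n_resamples) - 1]
    return lower, upper


# Invented placeholder pKa shifts (NOT real data), one entry per residue.
exp_shift = [0.5, -1.0, 0.2, 1.5, -0.3, 0.8]
method_a = [0.3, -0.4, 0.6, 0.9, 0.1, 0.5]    # hypothetical predictor A
method_b = [0.9, -1.5, -0.2, 1.0, -0.6, 1.2]  # hypothetical predictor B
null_model = [0.0] * len(exp_shift)           # assumed null: no pKa shift

combined = consensus(method_a, method_b)
print("RMSE A:", rmse(method_a, exp_shift))
print("RMSE B:", rmse(method_b, exp_shift))
print("RMSE consensus:", rmse(combined, exp_shift))
print("95% CI vs null:", bootstrap_rmse_diff(combined, null_model, exp_shift))
```

Averaging can never give a larger RMSE than the worst contributing predictor, which is one reason a simple consensus is a low-risk way to combine empirical methods. Whether it improves on the best single method depends on how correlated the individual errors are.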
