We used three different methods to compute 171 experimentally known pK(a) values of ionizable residues from 15 different proteins and compared the accuracy of the computed values in terms of the root mean square deviation (RMSD) from experiment. One method, KBPLUS, is based on a continuum electrostatic model of the protein that includes conformational flexibility. The other two are empirical approaches: PROPKA uses physically motivated energy terms with adjustable parameters, whereas PKAcal uses an empirical function with no physical basis. PROPKA reproduced the pK(a) values with the highest overall accuracy. When we divided the data set into weakly and strongly shifted experimental pK(a) values, however, we found that PROPKA's accuracy is better than that of KBPLUS for weakly shifted values but only on par with it for more strongly shifted values. PKAcal, by contrast, reproduces strongly shifted pK(a) values poorly but weakly shifted values with the same accuracy as PROPKA. We tested different consensus approaches that combine data from all three methods in search of a general procedure for the most accurate pK(a) predictions. In most cases, the consensus approach reproduced the experimental data more accurately than any of the individual methods alone.
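As a rough illustration of the evaluation described above, the following sketch computes the RMSD of predicted pK(a) values from experiment and forms one possible consensus prediction. The abstract does not specify which consensus schemes were tested, so the unweighted mean used here is only an assumption. The function names `rmsd` and `consensus_mean`, the generic method labels, and all numeric values are hypothetical and are not taken from the study.

```python
import math


def rmsd(predicted, experimental):
    """Root mean square deviation between predicted and experimental values."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(p - e) ** 2 for p, e in zip(predicted, experimental)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def consensus_mean(*method_predictions):
    """Unweighted mean over methods, residue by residue.

    Assumption: a plain average is only one conceivable consensus scheme.
    """
    return [sum(values) / len(values) for values in zip(*method_predictions)]


# Made-up pK(a) values for three residues (illustration only, not study data).
experimental = [4.0, 6.5, 10.0]
method_a = [4.5, 6.0, 10.5]
method_b = [3.5, 7.0, 9.0]
method_c = [4.0, 6.5, 11.0]

consensus = consensus_mean(method_a, method_b, method_c)

for label, values in [
    ("method A", method_a),
    ("method B", method_b),
    ("method C", method_c),
    ("consensus", consensus),
]:
    print(f"{label}: RMSD = {rmsd(values, experimental):.3f}")
```

In this toy data the errors of the individual methods partly cancel, so the averaged prediction has a lower RMSD than any single method. Such cancellation is not guaranteed in general, which is consistent with the abstract's statement that the consensus was better in most, not all, cases.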