Chemical shifts are receiving renewed attention in structural biology owing to the recent introduction of methodologies that enable their use in protein structure determination. As these approaches have so far been concerned mostly with backbone atoms, it would be highly desirable to generalize them to include side-chain atoms as well. A major motivation for this objective is that side chains play crucial roles in determining the conformational properties of protein surfaces and interior cavities, which in most cases define the specificity of biomolecular interactions. In particular, aromatic side chains can interact with a variety of chemical groups through hydrophobic, π–π stacking, π–anion, and π–cation interactions, and often form the hot spots of protein–protein and protein–ligand complex formation and of protein folding. Furthermore, aromatic side chains, as sources of ring-current effects, substantially influence the chemical shifts of other nuclei, including the heavily exploited backbone nuclei. However, although ring-current terms are frequently included in chemical shift predictions for backbone nuclei, aromatic chemical shifts are not normally used to define the geometry of the aromatic rings themselves. Recent advances in specific labeling technologies for aromatic side chains will soon increase the number of assigned aromatic chemical shifts, thus adding new prospects to the established methodology of aromatic chemical shift measurements. Incorporating the chemical shifts of aromatic side chains, in addition to those of backbone atoms, into structure-determination algorithms would extend the use of chemical shifts in structural studies. To achieve this goal, a chemical shift prediction method for side-chain nuclei that is based solely on the configurations of proximal atoms needs to be developed.
Predictions of this type, unlike those of other currently available chemical shift predictors that provide chemical shift evaluations for side-chain nuclei, are readily differentiable with respect to the atomic coordinates, and thus enable the calculation of biasing forces for the integration of the equations of motion within a molecular dynamics scheme. Prediction of aromatic side-chain chemical shifts by differentiable functions opens new opportunities to monitor a range of important processes, and will increase the scope of chemical shift usage in determining the structures of biomolecular complexes and complex biomolecular systems. To address this challenge, we present here ArShift, a chemical shift prediction method for protein side-chain aromatic ¹H nuclei. We then demonstrate that, by using only aromatic side-chain chemical shifts, structures that do not match the state in which the chemical shifts were measured can be identified. The ArShift predictions are based on known phenomenological terms that describe the effects of ring currents, magnetic anisotropy, and electric fields, accompanied by a set of dihedral angle terms and distance-based polynomials (see the Supporting Information). The analysis uses the aromatic chemical shift assignments available from the BMRB database, after filtering and re-referencing steps that reduce the number of inaccurate and artifactual entries (Figures S1 and S2 in the Supporting Information). To identify the mapping between chemical shifts and structures, only structures determined by X-ray crystallography at a resolution of 2.0 Å or better are considered in the derivation of the geometric terms. The combination of terms used in the predictions is then optimized through a Monte Carlo approach to decrease the number of fitted coefficients, thus increasing the significance of the remaining ones (Table S1).
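To illustrate why differentiability with respect to atomic coordinates matters, the following minimal sketch derives a biasing force from a chemical shift restraint. It is not the ArShift functional form: the toy predictor (a reference shift plus a sum of c / r^3 distance terms), the harmonic restraint, and all names and parameter values (`delta0`, `c`, `k`, `delta_exp`) are assumptions made purely for illustration.

```python
import numpy as np

def predict_shift(h_pos, neighbors, delta0=7.0, c=1.5):
    """Toy predictor (hypothetical): reference shift plus a sum of c / r^3 terms."""
    r = np.linalg.norm(neighbors - h_pos, axis=1)
    return delta0 + np.sum(c / r**3)

def shift_gradient(h_pos, neighbors, c=1.5):
    """Analytic gradient of the toy predictor with respect to the proton position."""
    d = h_pos - neighbors                      # vectors from each neighbor to the proton
    r = np.linalg.norm(d, axis=1)
    # d/dx (c * r^-3) = -3 * c * r^-5 * (x - x_j)
    return np.sum((-3.0 * c / r**5)[:, None] * d, axis=0)

def restraint_energy(h_pos, neighbors, delta_exp, k=10.0):
    """Harmonic penalty (assumed form) on predicted minus measured shift."""
    diff = predict_shift(h_pos, neighbors) - delta_exp
    return 0.5 * k * diff**2

def biasing_force(h_pos, neighbors, delta_exp, k=10.0):
    """Force on the proton: F = -dE/dx = -k * (pred - exp) * d(pred)/dx."""
    diff = predict_shift(h_pos, neighbors) - delta_exp
    return -k * diff * shift_gradient(h_pos, neighbors)

def numerical_force(h_pos, neighbors, delta_exp, k=10.0, h=1e-6):
    """Central finite-difference force, used only to check the analytic one."""
    force = np.zeros(3)
    for i in range(3):
        step = np.zeros(3)
        step[i] = h
        e_plus = restraint_energy(h_pos + step, neighbors, delta_exp, k)
        e_minus = restraint_energy(h_pos - step, neighbors, delta_exp, k)
        force[i] = -(e_plus - e_minus) / (2.0 * h)
    return force

# Made-up coordinates (in Å) for one proton and three neighboring atoms.
h_pos = np.array([0.0, 0.0, 0.0])
neighbors = np.array([[3.0, 0.0, 0.0], [0.0, 4.0, 0.0], [1.0, 2.0, 2.0]])
delta_exp = 7.2                                # hypothetical measured shift (ppm)

f_analytic = biasing_force(h_pos, neighbors, delta_exp)
f_numeric = numerical_force(h_pos, neighbors, delta_exp)
print("analytic force:", f_analytic)
print("numeric force: ", f_numeric)
```

Because the force is available in closed form, it can be added to the physical force field at every integration step; a predictor that is not differentiable would allow structures to be scored but not driven toward agreement with the measured shifts.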
We assessed the accuracy of the prediction method by performing individual predictions (in leave-one-out tests) for all the chemical shift entries used for deriving the coefficients. The standard deviations of the residual errors (denoted here as standard errors) for the models implemented in the ArShift package are 0.189, 0.204, 0.256, 0.191, and 0.173 ppm for the Phe-Hδ, Phe-Hε, Phe-Hζ, Tyr-Hδ, and Tyr-Hε nuclei, respectively (Figures S3 and S4). A comparison of the ArShift standard errors with the standard deviations of the corresponding chemical shift types in the BMRB database is presented in Figure 1. Predictions for ¹³C nuclei are not reported in this work because they do not currently provide a significant improvement over those based on the average values derived from the BMRB database. The reason for this situation is most probably the neglect of the stronger isotope effects on ¹³C

[*] A. B. Sahakyan, Dr. A. Cavalli, Prof. M. Vendruscolo, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW (UK). E-mail: mv245@cam.ac.uk
Dr. W. F. Vranken, European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge CB10 1SD (UK)
[] Current address: Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050 Brussel (Belgium)
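The leave-one-out assessment described above can be sketched as follows. This is an illustrative example on synthetic data, not the ArShift fitting code: the linear model, the feature matrix, the coefficient values, and the noise level are all assumptions, and the helper names `leave_one_out_residuals` and `standard_error` are hypothetical.

```python
import numpy as np

def leave_one_out_residuals(X, y):
    """Refit the linear model without entry i, then predict entry i."""
    n = len(y)
    residuals = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i               # mask that drops the held-out entry
        coef, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        residuals[i] = y[i] - X[i] @ coef
    return residuals

def standard_error(residuals):
    """Standard deviation of the residual errors, as defined in the text."""
    return float(np.std(residuals, ddof=1))

# Synthetic stand-in for geometric terms and measured shifts (all values made up).
rng = np.random.default_rng(0)
n_entries, n_terms = 200, 4
features = rng.normal(size=(n_entries, n_terms))
X = np.column_stack([np.ones(n_entries), features])     # intercept + terms
true_coef = np.array([7.0, 0.3, -0.2, 0.1, 0.05])
y = X @ true_coef + rng.normal(scale=0.2, size=n_entries)

res = leave_one_out_residuals(X, y)
print(f"leave-one-out standard error: {standard_error(res):.3f} ppm")
print(f"baseline (std of shifts):     {np.std(y, ddof=1):.3f} ppm")
```

The second printed value plays the role of the database standard deviation: a predictor is informative only to the extent that its leave-one-out standard error falls below the spread obtained by always predicting the average shift.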