Single nucleotide polymorphisms (SNPs) are a major contributor to genetic and phenotypic variation within populations. Non-synonymous SNPs (nsSNPs) modify the sequence of proteins and can affect their folding or binding properties. Experimental analysis of all nsSNPs is currently infeasible, so computational predictions of the molecular effects of nsSNPs are helpful for guiding experimental investigations. While some nsSNPs can be accurately characterized, for instance if they fall into strongly conserved or well-annotated regions, the molecular consequences of many others are more challenging to predict. In particular, nsSNPs affecting less structured, and often less conserved, regions are difficult to characterize. Binding sites that mediate protein-protein or other protein interactions are an important class of functional sites on proteins and can be used to help interpret nsSNPs. Binding sites targeted by the PDZ modular peptide recognition domain have recently been characterized. Here we use these data to show that it is possible to computationally identify nsSNPs in PDZ binding motifs that modify or prevent binding to the proteins containing the motifs. We confirm these predictions by experimentally validating a selected subset with ELISA. Our work also highlights the importance of better characterizing linear motifs in proteins, as many of these can be affected by genetic variation.