Abstract We use the statistical approach of random matrix theory and network theory to identify the motifs responsible for the functioning of the IGPD protein family. This directly addresses the question of which patterns of interaction between amino acid residues (based on their properties) contribute to protein function. We use the inverse participation ratio and the Shannon entropy to locate the important groups of correlated amino acid positions, which give the structural sites of the IGPD protein. These tools single out the eigenmodes corresponding to the smallest eigenvalues/outliers as the most localized; these modes give the crucial sites for the family. We also construct the threshold network of the IGPD protein and find that, at a certain threshold, similar sites emerge from the network analysis, which additionally gives the most strongly connected sites. This agreement supports our method of finding the structural and functional sites. These important sites also match those found in experiments.
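The two localization measures and the threshold network named above can be illustrated with a minimal sketch. It assumes a symmetric correlation matrix `corr` between amino acid positions is already available; the abstract does not say how that matrix is built. The function names, the squared-component normalization, and the use of absolute correlations for the network are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def ipr(vec):
    """Inverse participation ratio: sum of v_i^4 for a unit-norm vector.

    Equals 1/N for a fully delocalized vector of length N and 1 for a
    vector concentrated on a single position.
    """
    v = np.asarray(vec, dtype=float)
    v = v / np.linalg.norm(v)
    return float(np.sum(v ** 4))


def shannon_entropy(vec):
    """Shannon entropy of the squared components of a unit-norm vector.

    Equals log(N) for a fully delocalized vector and 0 for a vector
    concentrated on a single position.
    """
    v = np.asarray(vec, dtype=float)
    v = v / np.linalg.norm(v)
    p = v ** 2
    p = p[p > 0]  # drop zeros so that 0 * log(0) is treated as 0
    return float(-np.sum(p * np.log(p)))


def most_localized_modes(corr, top=3):
    """Return (mode index, eigenvalue, IPR) for the `top` highest-IPR modes.

    `corr` is assumed to be a symmetric correlation matrix (hypothetical input).
    """
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    iprs = np.array([ipr(eigvecs[:, k]) for k in range(eigvecs.shape[1])])
    order = np.argsort(iprs)[::-1][:top]
    return [(int(k), float(eigvals[k]), float(iprs[k])) for k in order]


def threshold_network(corr, threshold):
    """Boolean adjacency matrix linking positions with |correlation| >= threshold."""
    adj = np.abs(np.asarray(corr, dtype=float)) >= threshold
    np.fill_diagonal(adj, False)  # no self-loops
    return adj
```

In this sketch, positions carrying large components in a high-IPR (low-entropy) eigenvector are the candidate sites, and the row sums of the adjacency matrix give the degree of each position in the threshold network.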