In evolutionary studies of proteins, the average effect of natural selection on amino acid mutations can be examined by comparing the numbers of synonymous (dS) and nonsynonymous (dN) substitutions that have accumulated over the same time period. In this approach, destabilizing mutations occurring throughout the protein molecule may interfere with the detection of natural selection, particularly positive selection, operating on other mutations. Here, an attempt to detect positive selection while eliminating the effects of structural constraints is demonstrated using hemagglutinin (HA) of H3N2 human influenza A virus as an example. Compatible and incompatible amino acids were inferred at each site by computational analysis of the three-dimensional structure, with thermodynamic stability as the indicator, and natural selection was examined by comparing dS and dN among compatible amino acids. In an analysis of 2701 nucleotide sequences covering the entire coding region of HA, the new method identified twice as many positively selected amino acid sites as the ordinary method: 16 and 4 sites without and with correction for multiple testing, respectively, for the new method, compared with 8 and 2 sites for the ordinary method. The positively selected sites were involved in epitopes, the receptor-binding pocket, epistasis, and stabilization, which appears biologically reasonable. Nevertheless, several problems appear to remain, which may render this method largely conservative. The method may be effective when applied to many densely sampled sequences.
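The idea of comparing dS and dN among compatible amino acids can be illustrated with a minimal sketch. It is not the implementation used in the study: it assumes a Nei-Gojobori-style count of synonymous and nonsynonymous sites for a single codon, ignores changes to stop codons, and uses hypothetical names (`count_sites`, `compatible`). The set of compatible amino acids is supplied by hand here, whereas the study inferred it from the thermodynamic stability of the three-dimensional structure.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def count_sites(codon, compatible=None):
    """Return (synonymous_sites, nonsynonymous_sites) for one codon.

    Each of the nine single-nucleotide changes contributes 1/3 of a site.
    Assumption of this sketch: changes to stop codons are ignored.
    If `compatible` is given (a set of one-letter amino acids), changes to
    amino acids outside that set are treated as incompatible and excluded
    from the nonsynonymous site count.
    """
    original = CODON_TABLE[codon]
    syn_changes = 0
    nonsyn_changes = 0
    for position in range(3):
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            amino_acid = CODON_TABLE[mutant]
            if amino_acid == "*":
                continue  # stop codon: ignored in this sketch
            if amino_acid == original:
                syn_changes += 1
            elif compatible is None or amino_acid in compatible:
                nonsyn_changes += 1
    return syn_changes / 3, nonsyn_changes / 3


# Ordinary count for the codon TTT (Phe): all amino acid changes are allowed.
ordinary = count_sites("TTT")

# Restricted count: only Leu and Ile are assumed compatible (hypothetical set).
restricted = count_sites("TTT", compatible={"L", "I"})
```

Excluding incompatible amino acids leaves the synonymous site count unchanged and reduces the nonsynonymous site count. For a fixed number of observed nonsynonymous changes among compatible amino acids, the per-site estimate of dN is therefore larger than under the ordinary count.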