Abstract

The SARS-CoV-2 S protein is a major point of interaction between the virus and the human immune system. As a consequence, the S protein is not a static target but undergoes rapid molecular evolution. In order to more fully understand the selection pressure during evolution, we examined residue positions in the S protein that vary greatly across closely related viruses but are conserved in the subset of viruses that infect humans. These “evolutionarily important” residues were not distributed evenly across the S protein but were concentrated in two domains: the N-terminal domain and the receptor-binding domain, both of which play a role in host cell binding in a number of related viruses. In addition to being localized in these two domains, evolutionary importance correlated with structural flexibility and inversely correlated with distance from known or predicted host receptor-binding residues. Finally, we observed a bias in the composition of the amino acids that make up such residues toward more human-like, rather than virus-like, sequence motifs.

Highlights

  • Over 200 viruses are known to infect humans (Woolhouse et al, 2012)

  • The S1 subunit lies in the N-terminal portion of the S protein and can be divided into an N-terminal domain (NTD) and a C-terminal domain; the latter can itself be divided into a receptor-binding domain (RBD), located at the apex of the protein when viewed from the side, and two additional domains

  • We found that evolutionary importance was high in the NTD and RBD

Summary

INTRODUCTION

Over 200 viruses are known to infect humans (Woolhouse et al, 2012). Among recent human virus outbreaks, three (SARS-CoV-1, MERS-CoV, and SARS-CoV-2) have arisen from betacoronaviruses. To better understand the selection pressure acting on the S protein, we estimated the evolutionary importance of residue positions in SARS-CoV-2 by comparing the amino acid diversity of each position to that of equivalent positions in closely related viruses that infect non-human hosts. A high evolutionary rate in this lineage was clearly observed near hACE2-binding sites (Figure 1E). This observation is consistent with the site-specific diversity captured by the evolutionary importance; such sites have apparently changed radically upon transfer to humans and have been highly conserved thereafter. Evolutionarily important residues were restricted to two domains, the NTD and RBD, both of which have host receptor-binding functions in a number of closely related viruses. These “important” residues were more flexible than less important residues, suggesting that flexibility is a characteristic of rapid molecular evolution. Although the sequence data on SARS-CoV-2 are still limited, the patterns may provide clues about the identity of targeted human cell surface receptors.
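
The per-position comparison described above can be illustrated with a minimal sketch. It is not the scoring used in the study, which is not specified in this summary. It assumes Shannon entropy as the diversity measure and defines a hypothetical score as the entropy among related non-human viruses minus the entropy among human-infecting viruses, so that positions that vary across relatives but are conserved in human viruses score highest. The function names and the toy aligned fragments are invented for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (bits) of one alignment column, ignoring gap characters."""
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    counts = Counter(residues)
    total = len(residues)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def importance_scores(human_seqs, related_seqs):
    """Hypothetical score per position: diversity among related non-human
    viruses minus diversity among human-infecting viruses (an assumption,
    not the metric of the study)."""
    length = len(human_seqs[0])
    if any(len(s) != length for s in human_seqs + related_seqs):
        raise ValueError("all sequences must be aligned to the same length")
    scores = []
    for i in range(length):
        h_human = column_entropy([s[i] for s in human_seqs])
        h_related = column_entropy([s[i] for s in related_seqs])
        scores.append(h_related - h_human)
    return scores


# Toy, made-up aligned fragments (not real S protein sequences).
human = ["ACDK", "ACDK", "ACDR"]
related = ["ACDK", "GCEK", "SCQK", "TCDK"]

scores = importance_scores(human, related)
for position, score in enumerate(scores, start=1):
    print(f"position {position}: score {score:.3f}")
```

In this toy input, the first position varies across the related sequences and is fixed in the human ones, so it receives the highest score; the last position varies only in the human set and scores below zero.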
