Abstract
SARS-CoV-2 is mutating and giving rise to divergent variants across the world. An in-depth investigation of the amino acid substitutions in SARS-CoV-2 proteins is essential for understanding the virus's host adaptation and infection biology. A total of 9587 SARS-CoV-2 structural protein sequences collected from 49 countries are used to characterize protein-wise variants, substitution patterns (type and location), and major substitution changes. The majority of the substitutions are distinct, occur mostly at a particular location, and lead to a change in the amino acid's biochemical properties. In terms of mutational changes, the envelope (E) and membrane (M) proteins are relatively more stable than the nucleocapsid (N) and spike (S) proteins. Several co-occurring substitutions are observed, particularly in the S and N proteins. Substitutions specific to active sub-domains reveal that the heptapeptide repeat, fusion peptide, and transmembrane regions of the S protein, and the N-terminal and C-terminal domains of the N protein, are remarkably mutated. We also observe a few deleterious mutations in these domains. Overall, this study of non-synonymous mutations in the structural proteins of SARS-CoV-2 at the start of the pandemic indicates diversity among virus sequences.