Abstract

SARS-CoV-2 is a positive-sense single-stranded RNA virus with emerging mutations, especially on the Spike glycoprotein (S protein). To delineate the genomic diversity in association with geographic dispersion of SARS-CoV-2 variant lineages, we collected 939,591 complete S protein sequences deposited in the Global Initiative on Sharing All Influenza Data (GISAID) from December 2019 to April 2021. An exponential emergence of S protein variants was observed from October 2020, when the four major variants of concern (VOCs), namely, alpha (α) (B.1.1.7), beta (β) (B.1.351), gamma (γ) (P.1), and delta (δ) (B.1.617), started to circulate in various communities. We found that residues 452, 477, 484, and 501, the 4 key amino acids located in the hACE2 binding domain of S protein, were under positive selection. Through in silico protein structure prediction and immunoinformatics tools, we found that D614G is the key determinant of S protein conformational change, while the N439K, T478I, E484K, and N501Y variations in S1-RBD also had an impact on S protein binding affinity to hACE2 and on antigenicity. Finally, we predicted that the yet-to-be-identified hypothetical N439S, T478S, and N501K mutations could confer an even greater binding affinity to hACE2 and evade host immune surveillance more efficiently than the respective native variants. This study documented the evolution of SARS-CoV-2 S protein over the first 16 months of the pandemic and identified several key amino acid changes that are predicted to confer a substantial impact on transmission and immunological recognition. These findings convey crucial information to sequence-based surveillance programs and the design of next-generation vaccines.

Highlights

  • SARS-CoV-2 is a positive-sense single-stranded RNA virus with emerging mutations, especially on the Spike glycoprotein (S protein)

  • Based on dN/dS ratio and key amino acids involved in immune evasion, we focused on L5, L18, S98, and A222 (S1-NTD), N439, L452, S477, E484, and N501 (S1-RBD), and Q677 and P681 (S1-CTD)

  • We examined artificial mutations in S1-RBD (N439S, L452P, S477G, T478S, E484D, and N501K) that were predicted to lower S protein antigenicity (Table 2), and predicted their protein conformation, human angiotensin-converting enzyme 2 (hACE2) binding affinity, and antigenicity
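The substitution labels used throughout (for example N501K) follow the usual convention of reference residue, 1-based position, and replacement residue. As a minimal sketch of how such labels can be parsed and checked against a reference sequence, the Python below may help; the names `Substitution`, `parse_substitution`, and `apply_substitution` are hypothetical and are not taken from the study's pipeline.

```python
import re
from typing import NamedTuple


class Substitution(NamedTuple):
    """One amino acid substitution, e.g. N501K -> ref='N', pos=501, alt='K'."""
    ref: str
    pos: int  # 1-based residue position
    alt: str


# One uppercase letter, a run of digits, one uppercase letter.
_LABEL = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_substitution(label: str) -> Substitution:
    """Split a label such as 'N501K' into its three parts."""
    match = _LABEL.match(label.strip())
    if match is None:
        raise ValueError(f"not a substitution label: {label!r}")
    ref, pos, alt = match.groups()
    return Substitution(ref, int(pos), alt)


def apply_substitution(sequence: str, sub: Substitution) -> str:
    """Return a copy of `sequence` carrying the substitution.

    Raises if the position is out of range or if the residue found at that
    position does not match the reference residue named in the label.
    """
    index = sub.pos - 1  # convert 1-based position to 0-based index
    if not 0 <= index < len(sequence):
        raise IndexError(f"position {sub.pos} is outside the sequence")
    if sequence[index] != sub.ref:
        raise ValueError(
            f"expected {sub.ref} at position {sub.pos}, found {sequence[index]}"
        )
    return sequence[:index] + sub.alt + sequence[index + 1:]


# The hypothetical S1-RBD mutations listed in the highlights.
labels = ["N439S", "L452P", "S477G", "T478S", "E484D", "N501K"]
parsed = [parse_substitution(label) for label in labels]
```

To use it on real data, `apply_substitution` would be called with a full-length S protein reference sequence; the reference-residue check guards against off-by-one numbering errors before any downstream structure or binding prediction is run.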


Summary

Introduction

SARS-CoV-2 is a positive-sense single-stranded RNA virus with emerging mutations, especially on the Spike glycoprotein (S protein). Among the 4 viral structural proteins, S protein harbors the majority of amino acid variations [11, 12]. This study documented the evolution of SARS-CoV-2 S protein over the first 16 months of the pandemic and identified several key amino acid changes that are predicted to confer a substantial impact on transmission and immunological recognition. We also predicted the potential amino acid mutations that could arise in favor of SARS-CoV-2 virulence. These findings convey crucial information to sequence-based surveillance programs and are vital for the design of next-generation vaccines and for anti-SARS-CoV-2 drug discovery in an effort to combat COVID-19.
