Abstract

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has resulted in 92 million cases in a span of 1 year. This study focuses on understanding the population-specific variations that contribute to the high rate of infection in specific geographical regions, particularly the United States. Rigorous phylogenomic network analysis of 245 complete SARS-CoV-2 genomes inferred five central clades, named a (ancestral), b, c, d, and e (with subtypes e1 and e2). Clade d and subclade e2 consisted exclusively of U.S. strains. The clades were distinguished by 10 co-mutational combinations in Nsp3, ORF8, Nsp13, S, Nsp12, Nsp2, and Nsp6. Our analysis revealed that only 67.46% of single nucleotide polymorphism (SNP) mutations were reflected at the amino acid level. The T1103P mutation in Nsp3 was predicted to increase protein stability in 238 strains, the exceptions being 6 strains that were marked as the ancestral type, whereas the co-mutation (P409L and Y446C) in Nsp13 was found in 64 genomes from the United States, with 100% co-occurrence. Docking indicated that the D614G mutation reduced binding of the spike protein to angiotensin-converting enzyme 2 (ACE2) but improved its interaction with the TMPRSS2 receptor, contributing to high transmissibility among U.S. strains. We also found that the host proteins MYO5A, MYO5B, and MYO5C had the maximum interaction with viral proteins (nucleocapsid [N], spike [S], and membrane [M] proteins). Thus, blocking the internalization pathway by inhibiting MYO5 proteins could be an effective approach for coronavirus disease 2019 (COVID-19) treatment. The functional annotations of the host-pathogen interaction (HPI) network were found to be closely associated with hypoxia and thrombotic conditions, confirming the vulnerability and severity of infection. We also screened CpG islands in Nsp1 and N that confer on SARS-CoV-2 the ability to enter and trigger zinc antiviral protein (ZAP) activity inside the host cell.

IMPORTANCE In the current study, we present a global view of the mutational patterns observed in SARS-CoV-2 transmission. This provides a who-infects-whom geographical model since the early pandemic. It is hitherto the most comprehensive comparative genomics analysis of full-length genomes for co-mutations in different geographical regions, especially in U.S. strains. Compositional structural biology results suggested that mutations have a balance of opposing forces affecting pathogenicity, suggesting that only a few mutations are effective at the translation level. Novel HPI analysis and CpG predictions elucidate the proof of concept of hypoxia and thrombotic conditions in several patients. Thus, the current study focuses on understanding the population-specific variations that contribute to a high rate of SARS-CoV-2 infection in specific geographical regions, which may eventually be vital for the most severely affected countries and regions for the sharp development of custom-made vindication strategies.
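The 100% co-occurrence reported for P409L and Y446C in Nsp13 can be read as follows: every genome that carries one of the two mutations also carries the other. The sketch below illustrates that check under stated assumptions. The function name `co_occurrence`, the mutation label format, and the toy strains are all hypothetical and are not taken from the study, which does not describe its implementation here.

```python
def co_occurrence(genomes, mut_a, mut_b):
    """Return (n_both, n_either, fraction) for two mutation labels.

    genomes: dict mapping a strain ID to the set of mutation labels it carries
             (hypothetical input format, assumed for illustration).
    fraction: genomes carrying both mutations divided by genomes carrying
              at least one of them; 0.0 if neither mutation is present.
    """
    n_both = sum(1 for muts in genomes.values() if mut_a in muts and mut_b in muts)
    n_either = sum(1 for muts in genomes.values() if mut_a in muts or mut_b in muts)
    fraction = n_both / n_either if n_either else 0.0
    return n_both, n_either, fraction


# Toy data for illustration only; these are not strains from the study.
toy_genomes = {
    "strain_1": {"Nsp13:P409L", "Nsp13:Y446C", "S:D614G"},
    "strain_2": {"Nsp13:P409L", "Nsp13:Y446C"},
    "strain_3": {"S:D614G"},
}

print(co_occurrence(toy_genomes, "Nsp13:P409L", "Nsp13:Y446C"))  # (2, 2, 1.0)
```

A fraction of 1.0 corresponds to complete co-occurrence. Any genome carrying only one of the two mutations would lower it.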

Highlights

  • The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has resulted in 92 million cases in a span of 1 year

  • We analyzed the transfer of genomic single nucleotide polymorphism (SNP) to amino acid levels and associations of CpG dinucleotides contributing toward the pathogenicity of SARS-CoV-2, since the CpG islands have always been linked with epigenetic regulation and act as the hot spots for methylation in the case of viruses [11,12,13]

  • We found that co-mutations in Nsp3, ORF8, Nsp13, S, Nsp12, Nsp2, and Nsp6 were responsible for the divergence among clades


Summary

Introduction

The severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) pandemic has resulted in 92 million cases in a span of 1 year. In the current study, we present a global view of the mutational patterns observed in SARS-CoV-2 transmission, which provides a who-infects-whom geographical model since the early pandemic. In our previous study [5], we reported a higher mutational rate, through the accumulation of single nucleotide polymorphisms (SNPs), in genomes from different geographical locations around the world. Even during these early stages of the global pandemic, genomic surveillance has been used to differentiate circulating strains into distinct, geographically based lineages [6]. The conservation of CpG dinucleotides toward the extremities of all the genomes considered in the present analysis indicates their importance in evading host immunity.
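To make the CpG observation concrete, the sketch below computes a commonly used observed/expected CpG ratio, (CpG count × length) / (C count × G count), for a whole sequence and for its two ends. This is an assumption for illustration: the summary does not state which CpG measure or window size the authors used, and the function names, the window size `k`, and the example sequence are hypothetical.

```python
def cpg_obs_exp(seq):
    """Observed/expected CpG ratio: (CpG count * length) / (C count * G count).

    Returns 0.0 when the sequence contains no C or no G.
    """
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    if c == 0 or g == 0:
        return 0.0
    return cpg * n / (c * g)


def terminal_cpg(seq, k):
    """Return the CpG ratio of the first k and the last k bases of seq."""
    return cpg_obs_exp(seq[:k]), cpg_obs_exp(seq[-k:])


# Toy sequence for illustration only; not a SARS-CoV-2 genome.
print(cpg_obs_exp("ACGCGT"))            # 3.0
print(terminal_cpg("ACGTAAAATTCG", 4))  # (4.0, 4.0)
```

Comparing the ratios at the ends with the ratio over the full genome is one simple way to check whether CpG dinucleotides are concentrated toward the extremities.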

