Abstract

The COVID-19 pandemic, caused by the Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2), was declared by the World Health Organization on March 11, 2020. As of May 31, 2020, more than 6 million COVID-19 cases had been diagnosed worldwide, with over 370,000 deaths, according to Johns Hopkins. Thousands of SARS-CoV-2 strains have been sequenced to date, providing a valuable opportunity to investigate the evolution of the virus on a global scale. We performed a phylogenetic analysis of over 1,225 SARS-CoV-2 genomes sampled from late December 2019 to mid-March 2020. We identified a missense mutation, D614G, in the spike protein of SARS-CoV-2 that defines a clade which has become predominant in Europe (954 of 1,449 (66%) sequences) and is spreading worldwide (1,237 of 2,795 (44%) sequences). Molecular dating analysis estimated that this clade emerged in mid-to-late January (10–25 January) 2020. We also applied structural bioinformatics to assess the potential impact of D614G on the virulence and epidemiology of SARS-CoV-2. In silico analyses of the spike protein structure suggest that the mutation is most likely neutral to protein function as it relates to interaction with the human ACE2 receptor. The lack of available clinical metadata prevented us from investigating any association between viral clade and disease severity. Future work that combines clinical outcome data with both viral and human genomic diversity is needed to monitor the pandemic.

Highlights

  • In late December 2019, a cluster of atypical pneumonia cases was reported and epidemiologically linked to a wholesale seafood market in Wuhan, Hubei Province, China[1]

  • Based on the high nucleotide identity of SARS-CoV-2 to a bat coronavirus isolate (96%)[7], a possible scenario is that SARS-CoV-2 underwent a period of adaptation in an as-yet-unidentified animal host, facilitating its capacity to jump species boundaries and infect humans[3]

  • We performed an initial phylogenetic analysis of 749 SARS-CoV-2 genome sequences sampled from late December 2019 to March 13, 2020 and noted 152 sequences, first isolated in Europe beginning in February 2020, that appear to have emerged as a distinct phylogenetic clade

Materials and methods

(excluding sequences from non-human specimens) and demographics when available were obtained from GISAID (https://gisaid.org/) (downloaded on 30 March 2020) to perform the statistical analyses of the D614G mutation (8 sequences were excluded owing to an ambiguous or unknown nucleotide at this position). The two pools of amplicons were combined and cleaned with Agencourt AMPure XP beads (Beckman Coulter) prior to library preparation using Nextera XT (Illumina), following the manufacturer's protocol except that half the reagent volume was used throughout. Initial quality filtering was performed by iteratively removing sequences that were redundant (100% nucleotide identity), had fewer than 2,940 informative positions, or lacked informative sampling dates. Following previous recommendations (adapted from https://virological.org/t/issues-with-sars-cov2-sequencing-data/473)[34], and to account for regions that might be the result of hypervariability or sequencing artifacts, alignment positions showing significant homoplasy were first identified using TreeTime (homoplasy setting). To protect patient privacy and confidentiality, data are reported in an anonymized format.
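
The initial quality filtering described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Genome` record, the function names, the definition of an "informative position" as an unambiguous A/C/G/T base, and the rule that an informative date must be a full year-month-day are all assumptions made for illustration. Redundancy is approximated here by exact string identity in a single pass, whereas the text describes an iterative procedure on aligned sequences.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

# Threshold taken from the text; everything else below is a hypothetical sketch.
MIN_INFORMATIVE_POSITIONS = 2940
UNAMBIGUOUS_BASES = set("ACGT")  # assumption: only these count as informative


@dataclass
class Genome:
    name: str
    sequence: str
    sampling_date: Optional[str]  # assumed ISO-like, e.g. "2020-03-13"


def informative_positions(sequence: str) -> int:
    """Count unambiguous bases (assumed meaning of 'informative positions')."""
    return sum(1 for base in sequence.upper() if base in UNAMBIGUOUS_BASES)


def has_informative_date(date: Optional[str]) -> bool:
    """Assumption: a date is informative only if year, month and day are given."""
    if not date:
        return False
    parts = date.split("-")
    return len(parts) == 3 and all(part.isdigit() for part in parts)


def quality_filter(genomes: Iterable[Genome]) -> List[Genome]:
    """Drop redundant, low-coverage, or undated genomes, keeping input order."""
    kept: List[Genome] = []
    seen_sequences = set()
    for genome in genomes:
        sequence = genome.sequence.upper()
        if sequence in seen_sequences:
            continue  # 100% identical to a genome already kept
        if informative_positions(sequence) < MIN_INFORMATIVE_POSITIONS:
            continue  # too few informative positions
        if not has_informative_date(genome.sampling_date):
            continue  # missing or partial sampling date
        seen_sequences.add(sequence)
        kept.append(genome)
    return kept
```

A sequence is recorded as seen only after it passes every check, so an undated genome does not cause an identical, properly dated genome to be discarded.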
