Abstract

Background

A mechanistic understanding of the spread of SARS-CoV-2 and diligent tracking of ongoing mutagenesis are of key importance to plan robust strategies for confining its transmission. Large numbers of available sequences and their dates of transmission provide an unprecedented opportunity to analyze evolutionary adaptation in novel ways. Addition of high-resolution structural information can reveal the functional basis of these processes at the molecular level. Integrated systems biology-directed analyses of these data layers afford valuable insights to build a global understanding of the COVID-19 pandemic.

Results

Here we identify globally distributed haplotypes from 15,789 SARS-CoV-2 genomes and model their success based on their duration, dispersal, and frequency in the host population. Our models identify mutations that are likely compensatory adaptive changes that allowed for rapid expansion of the virus. Functional predictions from structural analyses indicate that, contrary to previous reports, the Asp614Gly mutation in the spike glycoprotein (S) likely reduced transmission and the subsequent Pro323Leu mutation in the RNA-dependent RNA polymerase led to the precipitous spread of the virus. Our model also suggests that two mutations in the nsp13 helicase allowed for the adaptation of the virus to the Pacific Northwest of the USA. Finally, our explainable artificial intelligence algorithm identified a mutational hotspot in the sequence of S that also displays a signature of positive selection and may have implications for tissue or cell-specific expression of the virus.

Conclusions

These results provide valuable insights for the development of drugs and surveillance strategies to combat the current and future pandemics.

Highlights

  • A mechanistic understanding of the spread of severe acute respiratory syndrome (SARS)-CoV-2 and diligent tracking of ongoing mutagenesis are of key importance to plan robust strategies for confining its transmission

  • A recent study focused on mutations in the spike glycoprotein and gave indications that both positive selection and recombination may be occurring at the molecular level [5, 6]

  • Overview of the approach: The RNA genome of SARS-CoV-2 is enveloped by a lipidic membrane and its structural proteins, namely, spike glycoprotein (S), envelope (E), membrane glycoprotein (M), and nucleocapsid (N)


Summary

Results

We identify globally distributed haplotypes from 15,789 SARS-CoV-2 genomes and model their success based on their duration, dispersal, and frequency in the host population. Our models identify mutations that are likely compensatory adaptive changes that allowed for rapid expansion of the virus. Functional predictions from structural analyses indicate that, contrary to previous reports, the Asp614Gly mutation in the spike glycoprotein (S) likely reduced transmission and the subsequent Pro323Leu mutation in the RNA-dependent RNA polymerase led to the precipitous spread of the virus. Our model suggests that two mutations in the nsp13 helicase allowed for the adaptation of the virus to the Pacific Northwest of the USA. Our explainable artificial intelligence algorithm identified a mutational hotspot in the sequence of S that displays a signature of positive selection and may have implications for tissue or cell-specific expression of the virus.
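The three quantities named above (duration, dispersal, and frequency) can be illustrated with a small sketch. This is not the authors' model, which is not described in this summary. It only shows one plausible way to tabulate the three quantities per haplotype, under assumed definitions: frequency as the share of all sampled genomes, duration as the number of days between the first and last sample, and dispersal as the number of distinct sampling locations. The function name `summarize_haplotypes`, the record layout, and the sample data are hypothetical.

```python
from collections import defaultdict
from datetime import date


def summarize_haplotypes(records):
    """Tabulate per-haplotype summary statistics.

    records: iterable of (haplotype_id, collection_date, location) tuples.
    The metric definitions below are illustrative assumptions, not the
    definitions used in the paper.
    """
    groups = defaultdict(list)
    for haplotype, day, location in records:
        groups[haplotype].append((day, location))

    total = sum(len(obs) for obs in groups.values())
    summary = {}
    for haplotype, obs in groups.items():
        days = [day for day, _ in obs]
        summary[haplotype] = {
            # Share of all sampled genomes carrying this haplotype.
            "frequency": len(obs) / total,
            # Days between the first and last observation.
            "duration_days": (max(days) - min(days)).days,
            # Number of distinct sampling locations.
            "dispersal": len({location for _, location in obs}),
        }
    return summary


# Made-up records for illustration only.
records = [
    ("H1", date(2020, 1, 10), "CN"),
    ("H1", date(2020, 2, 20), "IT"),
    ("H1", date(2020, 3, 1), "US"),
    ("H2", date(2020, 2, 1), "US"),
]
summary = summarize_haplotypes(records)
for haplotype, stats in sorted(summary.items()):
    print(haplotype, stats)
```

A real analysis would also need to account for uneven sequencing effort across regions and dates, which this tabulation ignores.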

