Abstract

COVID-19, the disease caused by the SARS-CoV-2 virus, has produced several outbreaks marked by a high rate of viral mutation related to lethality and transmissibility. Although these mutations occur throughout the viral genome, missense mutations in the Spike (S) protein are key to understanding how the virus adapts to target cells. It is therefore necessary to identify spatio-temporal mutation patterns in the S-protein from genomic databases that could be correlated with epidemiological parameters. Our proposed method, S-SPAM (Search method for Spatio-temporal Patterns of Mutations), uses Topological Data Analysis (TDA) and Machine Learning to find spatio-temporal mutation patterns and compare them with the corresponding epidemiological data, which were extracted from NCBI (National Center for Biotechnology Information) databases [1]. The D614G mutation was associated with an increase in reported cases of infection per day, while S477N was associated with an increase in deaths. Finally, a high rate of variability was identified in the NTD (N-terminal domain) region of the S-protein, suggesting that it could be related to modulation of the host's immune response and to an increase in virulence factors.

