Abstract

Surveillance of the COVID-19 pandemic has led to the determination of millions of SARS-CoV-2 genome sequences, producing a wealth of information never before collected for an infectious disease. Using data retrieved from the GISAID database, which at that time reported >13 million genome sequences, we classified the 141,639 unique missense mutations detected in the first two and a half years of the pandemic (up to October 2022). Notably, our analysis indicates that 98.2% of all possible conservative amino acid replacements were observed. Even non-conservative mutations were highly represented (73.9%). For a significant number of residues (3%), replacements with all of the other nineteen amino acids were observed. These observations strongly indicate that, in this time interval, the virus explored all possible missense alternatives at every site of its polypeptide chain, and that the replacements not observed severely affect SARS-CoV-2 integrity. The implications of these findings go well beyond the structural biology of SARS-CoV-2, as the large body of information collected and classified here may be valuable for elucidating sequence-structure-function relationships in proteins.
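The coverage figures quoted in the abstract are, at heart, counts of observed replacements per site divided by the number of possible ones. The following is a minimal sketch of that arithmetic, not the authors' pipeline: the function names, the `(protein, reference, position, replacement)` tuple layout, and the toy data are all assumptions made for illustration, and the conservative versus non-conservative split is left out because the abstract does not define it.

```python
from collections import defaultdict

# The twenty standard amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def observed_replacements(mutations):
    """Group unique missense replacements by site.

    `mutations` is an iterable of (protein, ref, position, alt) tuples;
    this layout is a hypothetical convenience, not a GISAID format.
    """
    observed = defaultdict(set)
    for protein, ref, pos, alt in mutations:
        # Keep only standard residues and true replacements (ref != alt).
        if ref in AMINO_ACIDS and alt in AMINO_ACIDS and ref != alt:
            observed[(protein, pos, ref)].add(alt)
    return observed


def fraction_fully_substituted(observed, n_sites):
    """Fraction of sites where all 19 alternative residues were seen."""
    full = sum(1 for alts in observed.values() if len(alts) == 19)
    return full / n_sites


def replacement_coverage(observed, n_sites):
    """Observed replacements divided by the 19 * n_sites possible ones."""
    seen = sum(len(alts) for alts in observed.values())
    return seen / (19 * n_sites)


# Toy input (invented, not real surveillance data): one saturated site,
# one site with a single replacement reported twice, two untouched sites.
toy = [("toy", "D", 10, alt) for alt in AMINO_ACIDS if alt != "D"]
toy += [("toy", "N", 20, "Y"), ("toy", "N", 20, "Y")]
toy_observed = observed_replacements(toy)
toy_full = fraction_fully_substituted(toy_observed, n_sites=4)
toy_coverage = replacement_coverage(toy_observed, n_sites=4)
```

Storing replacements in a set per site means repeated reports of the same mutation count once, which mirrors the abstract's focus on unique missense mutations rather than on how often each was sequenced.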

