Abstract

With over 16 million submitted genomic sequences, SARS-CoV-2 (SC2), the virus that caused the recent worldwide COVID-19 pandemic, has become the most sequenced of all known viruses, revealing, among other things, a vast number of expanding viral lineages. Since the pandemic phase appears to be over, we retrospectively re-examined the demographic grouping patterns of the viral sequences and their genomic characteristics over the entire pandemic period up to the peak of the last pandemic wave. For our study, we extracted only unique viral sequences from the NCBI and converted each sequence into a relational vector indicating the presence or absence of each variational event relative to a “reference” sequence. Our study revealed several genomic features that are unexpected or that differ from those reported in previous studies. For example, approximately 44,000 variants with unique sequences emerged during the pandemic period; they fall into only four major viral-genomic groups, each with a set of mostly unique highly-conserved variant-genotypes (HCVGs); and a small set of HCVGs from the first (“ancestral”) group was inherited by the three “descendant” groups, suggesting that the HCVGs of the next group may be predictable from those of the current group(s). Such a concept may be important in designing “panvalent” vaccines against current and future waves of viral infection.
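
To make the presence/absence encoding concrete, the following minimal Python sketch shows one way such a vector could be built. It is an illustration under stated assumptions, not the authors' pipeline: the sequences are toy strings assumed to be already aligned to the reference at equal length, only single-base substitutions are treated as variational events (insertions and deletions are ignored), ambiguous "N" calls are skipped, and the function names `variant_events` and `presence_absence_vectors` are hypothetical.

```python
def variant_events(reference, sequence):
    """Return the substitutions in `sequence` as (position, ref_base, alt_base).

    Positions are 1-based. Assumes `sequence` is already aligned to
    `reference` (equal length); indels are not handled in this sketch.
    """
    if len(reference) != len(sequence):
        raise ValueError("sequences must be pre-aligned to the same length")
    return {
        (i + 1, ref_base, alt_base)
        for i, (ref_base, alt_base) in enumerate(zip(reference, sequence))
        if ref_base != alt_base and alt_base != "N"  # skip ambiguous calls (assumption)
    }


def presence_absence_vectors(reference, sequences):
    """Encode each sequence as a 0/1 vector over all observed events."""
    per_sequence = [variant_events(reference, s) for s in sequences]
    # Columns: every distinct event seen in any sequence, in sorted order.
    events = sorted(set().union(*per_sequence))
    vectors = [[1 if e in found else 0 for e in events] for found in per_sequence]
    return events, vectors


# Toy data, invented purely for illustration.
reference = "ACGTACGT"
sequences = ["ACGTACGT", "ATGTACGT", "ATGTACGA"]

events, vectors = presence_absence_vectors(reference, sequences)
for seq, vec in zip(sequences, vectors):
    print(seq, vec)
```

In this representation, each column corresponds to one variational event and each row to one unique sequence, so events shared by nearly all sequences in a group show up as columns that are almost entirely 1 within that group.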
