Abstract

Background: Large-scale sequencing of SARS-CoV-2 has enabled the study of viral evolution during the COVID-19 pandemic. Some viral mutations may be advantageous to viral replication within hosts but detrimental to transmission, thus carrying a transient fitness advantage. By affecting the number of descendants, persistence times, and growth rates of associated clades, these mutations generate localised imbalance in phylogenies. Quantifying these features in closely related clades with and without recurring mutations can elucidate the tradeoffs between within-host replication and between-host transmission.

Methods: We implemented a novel phylogenetic clustering algorithm (mlscluster, https://github.com/mrc-ide/mlscluster) to systematically explore time-scaled phylogenies for mutations under transient/multilevel selection. We applied this method to a SARS-CoV-2 time-calibrated phylogeny with >1.2 million sequences from England, and characterised the recurrent mutations that may influence transmission fitness across PANGO-lineages and genomic regions using Poisson regressions and summary statistics.

Results: We found no major differences across two epidemic stages (before and after Omicron), PANGO-lineages, and genomic regions. However, spike, nucleocapsid, and ORF3a were proportionally more enriched for TFP-homoplasies than other proteins. We provide a catalog of SARS-CoV-2 sites under multilevel selection, which can guide experimental investigations within and beyond the spike protein.

Conclusions: This study highlights the existence of important tradeoffs between within-host replication and between-host transmission shaping the fitness landscape of SARS-CoV-2.
