The COVID-19 pandemic created an urgent need for widespread genomic surveillance to rapidly detect and monitor emerging SARS-CoV-2 variants. Meeting that need required the design, development, and deployment of a nationwide infrastructure for sequestration, consolidation, and characterization of patient samples that disseminates de-identified information to public authorities with tight turnaround times. Here, we describe our development of such an infrastructure, which sequenced 594,832 high-coverage SARS-CoV-2 genomes from isolates we collected in the United States (U.S.) from March 13, 2020 to July 3, 2023. Our sequencing protocol (‘Virseq’) uses wet and dry lab procedures to generate mutation-resistant sequencing of the entire SARS-CoV-2 genome, capturing all major lineages. We also characterize 379 clinically relevant SARS-CoV-2 multi-strain co-infections and ensure robust detection of emerging lineages via simulation. The modular infrastructure, sequencing, and analysis capabilities we describe support the U.S. Centers for Disease Control and Prevention national surveillance program and serve as a model for rapid response to emerging pandemics at a national scale.