Abstract

Introduction: Although most laboratories are capable of employing established protocols to perform full-genome SARS-CoV-2 sequencing, many are unable to assess sequence quality, select appropriate mutation-detection thresholds, or report on the potential clinical significance of mutations in the targets of antiviral therapy.

Methods: We describe the technical aspects and benchmark the performance of Sierra SARS-CoV-2, a program designed to perform these functions on user-submitted FASTQ and FASTA sequence files and lists of Spike mutations. Sierra SARS-CoV-2 indicates which sequences contain an unexpectedly large number of unusual mutations and which mutations are associated with reduced susceptibility to clinical-stage mAbs, the RdRP inhibitor remdesivir, or the Mpro inhibitor nirmatrelvir.

Results: To assess the performance of Sierra SARS-CoV-2 on FASTQ files, we applied it to 600 representative FASTQ sequences and compared the results to those of the COVID-19 EDGE program. To assess its performance on FASTA files, we applied it to nearly one million representative FASTA sequences and compared the results to the GISAID mutation annotation. To assess its performance on mutation lists, we applied it to 13,578 distinct Spike RBD mutation patterns and showed that exactly or partially matching annotations were available for 88% of patterns.

Conclusion: Sierra SARS-CoV-2 leverages previously published data to improve the quality control of submitted viral genomic data and to provide functional annotation on the impact of mutations in the targets of antiviral SARS-CoV-2 therapy. The program can be found at https://covdb.stanford.edu/sierra/sars2/ and its source code at https://github.com/hivdb/sierra-sars2.
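
The two quality-control ideas named above, a mutation-detection threshold and a flag for sequences carrying many unusual mutations, can be illustrated with a minimal sketch. This is not the Sierra SARS-CoV-2 implementation: the `Mutation` record, the function names, the 5% read-fraction cutoff, the limit of three unusual mutations, and the example values are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mutation:
    """Hypothetical record for one amino-acid mutation (illustration only)."""
    gene: str             # e.g. "S", "RdRP", "Mpro"
    position: int         # amino-acid position within the gene
    amino_acid: str       # observed amino acid
    read_fraction: float  # share of reads carrying the mutation


def call_mutations(candidates, min_fraction=0.05):
    """Keep mutations whose read fraction meets the detection threshold.

    The 0.05 default is an assumed value, not one taken from the paper.
    """
    return [m for m in candidates if m.read_fraction >= min_fraction]


def count_unusual(mutations, usual):
    """Count mutations absent from a reference set of commonly seen ones."""
    return sum(
        (m.gene, m.position, m.amino_acid) not in usual for m in mutations
    )


def needs_review(mutations, usual, max_unusual=3):
    """Flag a sequence that carries more unusual mutations than allowed."""
    return count_unusual(mutations, usual) > max_unusual


# Illustrative input; the read fractions and the "usual" set are made up.
candidates = [
    Mutation("S", 501, "Y", 0.98),
    Mutation("S", 484, "K", 0.02),    # below the threshold, so not called
    Mutation("Mpro", 166, "V", 0.40),
]
usual = {("S", 501, "Y")}

called = call_mutations(candidates)
flagged = needs_review(called, usual)
```

Raising `min_fraction` trades sensitivity to minority variants for robustness against sequencing artifacts, which is why the choice of threshold matters for both quality assessment and mutation reporting.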
