Abstract
Wastewater surveillance of SARS-CoV-2 has emerged as a critical tool for tracking the spread of COVID-19. In addition to estimating relative case numbers using quantitative PCR, SARS-CoV-2 genomic RNA can be extracted from wastewater and sequenced. Many techniques exist for using the sequenced RNA to determine the relative abundance of known lineages in a sample. However, predicting novel lineages from wastewater data is very challenging because of its mixed composition and unreliable genomic coverage. In this work, we present a novel technique based on non-negative matrix factorization that reconstructs lineage definitions by analyzing data across different samples. We test the method on both synthetic and real wastewater sequencing data. We show that the technique is able to determine major lineages such as Omicron and Delta as well as sub-lineages such as BA.5.2.1. The method identifies emerging lineages in wastewater without the need for genomic data from clinical samples. It could be used for routine monitoring of SARS-CoV-2 as well as other emerging viral pathogens in wastewater. Additionally, it may help recover additional full-genome sequences for viruses with few available genomes.
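To illustrate the general idea of non-negative matrix factorization in this setting, the following minimal sketch factorizes a toy matrix of mutation frequencies (samples by mutation sites) into non-negative sample abundances and lineage definitions. It is not the implementation used in this work: the function name `nmf`, the choice of standard multiplicative updates, the iteration count, and the toy matrices `true_W` and `true_H` are all assumptions made purely for illustration.

```python
import numpy as np


def nmf(X, k, n_iter=2000, seed=0, eps=1e-9):
    """Approximate X (samples x mutations) as W @ H with W, H >= 0.

    Uses the standard multiplicative updates for the Frobenius-norm
    objective. This is an illustrative choice, not necessarily the
    optimizer used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_samples, n_mutations = X.shape
    W = rng.random((n_samples, k)) + eps   # abundance of each factor per sample
    H = rng.random((k, n_mutations)) + eps  # mutation profile of each factor
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Hypothetical toy data: two made-up "lineages" defined over six mutation sites
# (1 = mutation present), mixed in different proportions across five samples.
true_H = np.array([[1, 1, 0, 0, 1, 0],
                   [0, 0, 1, 1, 1, 0]], dtype=float)
true_W = np.array([[0.9, 0.1],
                   [0.7, 0.3],
                   [0.5, 0.5],
                   [0.2, 0.8],
                   [0.0, 1.0]])
X = true_W @ true_H  # observed mutation frequencies per sample

W, H = nmf(X, k=2)
rel_error = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
print("relative reconstruction error:", rel_error)
print("recovered lineage profiles (rows):")
print(np.round(H, 2))
```

The recovered factors are determined only up to permutation and scaling of the rows of `H`, so in practice they would need to be normalized before being read as lineage definitions. Real wastewater data would also involve noise and missing coverage, which this sketch does not model.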