Abstract
Background
Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT), formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy), is in charge of SARS-CoV-2 surveillance at the genomic scale. Because SARS-CoV-2 surveillance requires correct and fast assessment of epidemiological clusters from a substantial number of samples, the present study proposes an analytical workflow for accurately identifying the PANGO lineages of SARS-CoV-2 samples and building discriminant minimum spanning trees (MST), bypassing the usual time-consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution models.
Results
GENPAT assembled two collections of SARS-CoV-2 samples. The first collection consisted of SARS-CoV-2 positive swabs collected by IZSAM from the Abruzzo region (Italy), then sequenced by next generation sequencing (NGS) and analyzed in GENPAT (n = 1592), while the second collection included samples from several Italian provinces retrieved from the reference Global Initiative on Sharing All Influenza Data (GISAID) (n = 17,201). The main results of the present work showed that (i) GENPAT and GISAID detected the same PANGO lineages, (ii) the PANGO lineages B.1.177 (i.e. historical in Italy) and B.1.1.7 (i.e. ‘UK variant’) are major concerns today in several Italian provinces, (iii) the new MST-based method clusters most of the PANGO lineages together, (iv) with a higher discriminatory power than PANGO lineages, and (v) faster than the usual phylogenomic methods based on MSA and substitution models.
Conclusions
The genome sequencing efforts of Italian provinces, combined with a structured national system of NGS data management, provided support for SARS-CoV-2 surveillance in Italy. We propose to build phylogenomic trees of SARS-CoV-2 variants through an accurate, discriminant and fast MST-based method that avoids the typical time-consuming steps related to MSA and substitution model-based phylogenomic inference.
Highlights
The coronavirus disease 2019 (COVID-19) responsible for the current pandemic is caused by a novel coronavirus (CoV) named SARS-CoV-2 [1].
Considering that SARS-CoV-2 surveillance needs an accurate, discriminant and fast assessment of epidemiological clusters from a substantial number of samples, the present study provides a workflow based on variant calling analysis, the so-called GENPAT workflow, to accurately identify the PANGO lineages of SARS-CoV-2 samples in Italy and rapidly build highly discriminant minimum spanning trees (MST), bypassing the usual time-consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution models.
The questions described above were assessed with the GENPAT workflow, which combines the identification of PANGO lineages based on variant calling analysis (Fig. 1) with MST-based phylogenomic inference (Figs. 1 and 2), using two collections of SARS-CoV-2 samples isolated in Italy up to April 2021, from GENPAT (Additional file 1) and from the Global Initiative on Sharing All Influenza Data (GISAID) (Additional file 2).
Summary
The coronavirus disease 2019 (COVID-19) responsible for the current pandemic is caused by a novel coronavirus (CoV) named SARS-CoV-2 [1]. At the date of the present study (May 2021), 222 countries were affected by SARS-CoV-2, with 153,527,666 coronavirus cases, 3,217,267 deaths, 680,364 daily new cases, and 9981 daily deaths [5]. Patients with severe COVID-19 may develop pneumonia or acute respiratory distress syndrome (ARDS), which is often fatal [9] and requires mechanical ventilation and treatment in an intensive care unit [8]. Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT), formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy), is in charge of SARS-CoV-2 surveillance at the genomic scale. Because SARS-CoV-2 surveillance requires correct and fast assessment of epidemiological clusters from a substantial number of samples, the present study proposes an analytical workflow for accurately identifying the PANGO lineages of SARS-CoV-2 samples and building discriminant minimum spanning trees (MST), bypassing the usual time-consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution models.
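To illustrate why an MST can be built without MSA or a substitution model, here is a minimal sketch, not the GENPAT implementation. It assumes each sample is already reduced to a set of called variants, and it assumes the distance between two samples is the number of variants found in only one of them. The sample names, the variant labels, and the function names `variant_distance` and `minimum_spanning_tree` are hypothetical; the tree is built with Prim's algorithm.

```python
def variant_distance(a, b):
    """Assumed distance: number of variants present in exactly one sample."""
    return len(a ^ b)


def minimum_spanning_tree(samples):
    """Build an MST with Prim's algorithm over all sample pairs.

    samples: dict mapping a sample name to a set of variant labels.
    Returns a list of (sample_in_tree, new_sample, distance) edges.
    """
    names = sorted(samples)
    if not names:
        return []
    in_tree = {names[0]}
    edges = []
    while len(in_tree) < len(names):
        # Pick the cheapest edge joining the tree to a sample outside it.
        dist, u, v = min(
            (variant_distance(samples[u], samples[v]), u, v)
            for u in in_tree
            for v in names
            if v not in in_tree
        )
        edges.append((u, v, dist))
        in_tree.add(v)
    return edges


# Hypothetical toy data: variant labels are placeholders, not real calls.
SAMPLES = {
    "sample_A": {"var1", "var2", "var3"},
    "sample_B": {"var1", "var2", "var3", "var4"},
    "sample_C": {"var1", "var2", "var3", "var4", "var5"},
    "sample_D": {"var1"},
}

for parent, child, dist in minimum_spanning_tree(SAMPLES):
    print(f"{parent} -> {child} (distance {dist})")
```

The only inputs are per-sample variant sets, so the cost is dominated by pairwise set comparisons rather than by alignment or likelihood optimisation. This naive version rescans all candidate edges at every step; for collections of thousands of samples a precomputed distance matrix and an optimised MST routine would be preferable.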