SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels

Adriano Di Pasquale,Nicolas Radomski,Cesare Cammà,Paolo Calistri,Alessio Lorusso,Iolanda Mangone

doi:10.1186/s12864-021-08112-0

Abstract

BackgroundFaced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT) formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy) is in charge of the SARS-CoV-2 surveillance at the genomic scale. In a context of SARS-CoV-2 surveillance requiring correct and fast assessment of epidemiological clusters from substantial amount of samples, the present study proposes an analytical workflow for identifying accurately the PANGO lineages of SARS-CoV-2 samples and building of discriminant minimum spanning trees (MST) bypassing the usual time consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution model.ResultsGENPAT constituted two collections of SARS-CoV-2 samples. The first collection consisted of SARS-CoV-2 positive swabs collected by IZSAM from the Abruzzo region (Italy), then sequenced by next generation sequencing (NGS) and analyzed in GENPAT (n = 1592), while the second collection included samples from several Italian provinces and retrieved from the reference Global Initiative on Sharing All Influenza Data (GISAID) (n = 17,201). The main results of the present work showed that (i) GENPAT and GISAID detected the same PANGO lineages, (ii) the PANGO lineages B.1.177 (i.e. historical in Italy) and B.1.1.7 (i.e. ‘UK variant’) are major concerns today in several Italian provinces, and the new MST-based method (iii) clusters most of the PANGO lineages together, (iv) with a higher dicriminatory power than PANGO lineages, (v) and faster that the usual phylogenomic methods based on MSA and substitution model.ConclusionsThe genome sequencing efforts of Italian provinces, combined with a structured national system of NGS data management, provided support for surveillance SARS-CoV-2 in Italy. We propose to build phylogenomic trees of SARS-CoV-2 variants through an accurate, discriminant and fast MST-based method avoiding the typical time consuming steps related to MSA and substitution model-based phylogenomic inference.

Highlights

The coronavirus disease 19 (COVID-19) responsible to the current pandemic is due to a novel coronavirus (CoV) named SARS-CoV-2 [1]
Considering that the SARS-CoV-2 surveillance needs an accurate, discriminant and fast assessment of epidemiological clusters from substantial amount of samples, the present study provides a variant calling analysis-based workflow, so-called GENPAT workflow, to accurately identify the PANGO lineages of SARS-CoV-2 samples in Italy and rapidly build highly discriminant minimum spanning tree (MST) bypassing the usual time consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution model
The questions described above were assessed with the GENPAT workflow combining the identification of PANGO lineages based on variant calling analysis (Fig. 1) and MST-based phylogenomic inference (Figs. 1 and 2), as well as two collections of SARS-CoV-2 samples isolated until April 2021 in Italy from GENPAT (Additional file 1) and Global Initiative on Sharing All Influenza Data (GISAID) (Additional file 2)

Summary

Introduction

The coronavirus disease 19 (COVID-19) responsible to the current pandemic is due to a novel coronavirus (CoV) named SARS-CoV-2 [1]. At the date of the present study (May 2021), 222 countries were affected by the SARS-CoV-2 with 153,527,666 coronavirus cases, as well as 3,217,267 deaths, 680,364 daily new cases, and 9981 daily deaths [5]. Patients with severe COVID-19 disease may develop pneumonia or acute respiratory distress syndrome (ARDS), which is often fatal [9] and requires mechanical ventilation and treatment from intensive care unit [8]. Faced with the ongoing global pandemic of coronavirus disease, the ‘National Reference Centre for Whole Genome Sequencing of microbial pathogens: database and bioinformatic analysis’ (GENPAT) formally established at the ‘Istituto Zooprofilattico Sperimentale dell’Abruzzo e del Molise’ (IZSAM) in Teramo (Italy) is in charge of the SARS-CoV-2 surveillance at the genomic scale. In a context of SARS-CoV-2 surveillance requiring correct and fast assessment of epidemiological clusters from substantial amount of samples, the present study proposes an analytical workflow for identifying accurately the PANGO lineages of SARS-CoV-2 samples and building of discriminant minimum spanning trees (MST) bypassing the usual time consuming phylogenomic inferences based on multiple sequence alignment (MSA) and substitution model

Methods

Results

Conclusion

Full Text

Paper version not known

Open DOI Link

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Journal: BMC Genomics	Publication Date: Oct 30, 2021
Citations: 12	License type: open-access

R Discovery Prime

R Discovery Prime

SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels

Abstract

Highlights

Summary

Talk to us

Similar Papers

More From: BMC Genomics

Lead the way for us

Similar Papers

Genotyping SARS-CoV-2 through an interactive web application
Hassaan Maan ... Bo Wang
The Lancet Digital Health | VOL. 2
Hassaan Maan, et. al.Hassaan Maan ... Bo Wang
12 Jun 2020
The Lancet Digital Health | VOL. 2

Genomic analysis of SARS-CoV-2 strains among Indians returning from Italy, Iran & China, & Italian tourists in India.
Savita Patil ...
Indian Journal of Medical Research | VOL. 151
Savita Patil, et. al.Savita Patil ...
01 Jan 2020
Indian Journal of Medical Research | VOL. 151

Genomic analysis of SARS-CoV-2 variants of concern circulating in Hawai'i to facilitate public-health policies.
David P Maison ... Vivek R Nerurkar
PloS one | VOL. 17
David P Maison, et. al.David P Maison ... Vivek R Nerurkar
01 Dec 2022
PloS one | VOL. 17

Phylogenetic classification of the whole-genome sequences of SARS-CoV-2 from India & evolutionary trends.
Sarah Cherian ... Santosh Jadhav
Indian Journal of Medical Research | VOL. 153
Sarah Cherian, et. al.Sarah Cherian ... Santosh Jadhav
01 Jan 2020
Indian Journal of Medical Research | VOL. 153

Editage

Paperpal

R Discovery

Mind the Graph

R Discovery Prime

R Discovery Prime

SARS-CoV-2 surveillance in Italy through phylogenomic inferences based on Hamming distances derived from pan-SNPs, -MNPs and -InDels

Abstract

Highlights

Summary

Talk to us

Similar Papers

More From: BMC Genomics