Abstract
We aimed to provide a comprehensive analysis of SARS-CoV-2 sequence diversity in Poland in the European context. All publicly available (n = 115; GISAID database) whole-genome SARS-CoV-2 sequences from Polish samples, including those obtained during coronavirus testing performed in our COVID-19 Lab, were examined. Multiple sequence alignment of Polish isolates, phylogenetic analysis (ML tree), and multidimensional scaling (based on pairwise DNA distances) were complemented by a comparison of coronavirus clade frequency and diversity in a subset of over 5000 European GISAID sequences. Approximately seventy-seven percent of isolates in the European dataset carried frequent and ubiquitous haplotypes; the remaining haplotype diversity was population-specific and resulted from population-specific mutations, homoplasies, and recombinations. Coronavirus strains circulating in Poland represented the variability found in other European countries. The prevalence of clades circulating in Poland was shifted in favor of GR, both in terms of the diversity (number of distinct haplotypes) and the frequency (number of isolates) of the clade. Polish-specific haplotypes were rare and could be explained by changes affecting common European strains. The analysis of whole viral genomes allowed detection of several tight clusters of isolates, presumably reflecting local outbreaks. New mutations, homoplasies, and, to a lesser extent, recombinations increase SARS-CoV-2 haplotype diversity, but the majority of these variants do not increase in frequency and remain rare and population-specific. The spectrum of SARS-CoV-2 haplotypes in the Polish dataset reflects many independent transfers from a variety of sources, followed by many local outbreaks. The prevalence of sequences belonging to the GR clade among Polish isolates is consistent with the European trend of increasing GR clade frequency.
Highlights
The first reports of a new type of coronavirus-caused pneumonia of unknown etiology came from Wuhan, Hubei Province, China, on 17/11/2019
We examined 115 whole-genome SARS-CoV-2 sequences from Polish samples, i.e., all currently available in the GISAID database, including those obtained during coronavirus testing performed in the COVID-19 Lab at the Institute of Human Genetics PAS
Haplotypes within the clades were distinguished based on the shared presence of mutations along the SARS-CoV-2 genomic sequence; positions with mutations in fewer than 0.3% (16/5013) of the European set of sequences were excluded from the haplotype definition
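The haplotype definition above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy sequences, and the reference string are hypothetical, and the sketch assumes pre-aligned sequences of equal length with no gaps or ambiguous bases. A position contributes to the haplotype only if the share of sequences differing from the reference there reaches a minimum fraction (16/5013 in the text; a larger value is used for the toy data).

```python
from collections import Counter


def informative_positions(sequences, reference, min_fraction):
    """Return 0-based positions where the share of sequences that differ
    from the reference is at least min_fraction."""
    n = len(sequences)
    keep = []
    for pos, ref_base in enumerate(reference):
        mutated = sum(1 for seq in sequences if seq[pos] != ref_base)
        if mutated / n >= min_fraction:
            keep.append(pos)
    return keep


def haplotype_counts(sequences, reference, min_fraction):
    """Define each haplotype as the bases at the informative positions
    and count how many sequences carry each haplotype."""
    positions = informative_positions(sequences, reference, min_fraction)
    counts = Counter("".join(seq[p] for p in positions) for seq in sequences)
    return positions, counts


# Toy alignment (hypothetical data, not real SARS-CoV-2 sequences).
reference = "ACGTAC"
sequences = [
    "ACGTAC",
    "ATGTAC",
    "ATGTAC",
    "ATGTAA",
    "ACGTAC",
    "ACGGAC",
]

# Toy threshold; for the European set described above it would be 16 / 5013.
positions, counts = haplotype_counts(sequences, reference, min_fraction=0.3)
print(positions)
print(counts)
```

In this toy alignment only the second position varies often enough to be kept, so the two singleton mutations do not split the sequences into extra haplotypes. This is the effect the frequency threshold is meant to have on rare variants.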
Summary
The first reports of a new type of coronavirus-caused pneumonia of unknown etiology came from Wuhan, Hubei Province, China, on 17/11/2019. The incidence of the disease soon increased exponentially, and the disease spread to other regions of the world (Tang et al 2020a). At the end of January 2020, the World Health Organization (WHO) declared COVID-19, the disease caused by infection with severe acute respiratory syndrome coronavirus type 2, a Public Health Emergency of International Concern. COVID-19 soon reached pandemic proportions. At the date of manuscript completion (5/9/2020), there were more than 27 million confirmed cases of the disease, with more than 880,000 deaths recorded worldwide (Coronavirus update (Live) n.d.); in Europe, over 3.7 million cases and 209,000 deaths were reported.