Abstract
Background
Rooted phylogenetic networks are used to display complex evolutionary history involving so-called reticulation events, such as genetic recombination. Various methods have been developed to construct such networks, using for example a multiple sequence alignment or multiple phylogenetic trees as input data. Coronaviruses are known to recombine frequently, but rooted phylogenetic networks have not yet been used extensively to describe their evolutionary history. Here, we created a workflow to compare the evolutionary history of SARS-CoV-2 with other SARS-like viruses using several rooted phylogenetic network inference algorithms. This workflow includes filtering noise from sets of phylogenetic trees by contracting edges based on branch length and bootstrap support, followed by resolution of multifurcations. We explored the running times of the network inference algorithms, the impact of filtering on the properties of the produced networks, and attempted to derive biological insights regarding the evolution of SARS-CoV-2 from them.
Results
The network inference algorithms are capable of constructing rooted phylogenetic networks for coronavirus data, although running-time limitations require restricting such datasets to a relatively small number of taxa. Filtering generally reduces the number of reticulations in the produced networks and increases their temporal consistency. Taxon bat-SL-CoVZC45 emerges as a major and structural source of discordance in the dataset. The tested algorithms often indicate that SARS-CoV-2/RaTG13 is a tree-like clade, with possibly some reticulate activity further back in their history. A smaller number of constructed networks posit SARS-CoV-2 as a possible recombinant, although this might be a methodological artefact arising from the interaction of bat-SL-CoVZC45 discordance and the optimization criteria used.
Conclusion
Our results demonstrate that as part of a wider workflow and with careful attention paid to running time, rooted phylogenetic network algorithms are capable of producing plausible networks from coronavirus data. These networks partly corroborate existing theories about SARS-CoV-2, and partly produce new avenues for exploration regarding the location and significance of reticulate activity within the wider group of SARS-like viruses. Our workflow may serve as a model for pipelines in which phylogenetic network algorithms can be used to analyse different datasets and test different hypotheses.
Highlights
Rooted phylogenetic networks are used to display complex evolutionary history involving so-called reticulation events, such as genetic recombination
The main aim of this paper is to create a workflow using various phylogenetic network algorithms proposed in the literature and to study their adequacy for explaining the evolutionary history of a selection of coronaviruses including SARS-CoV-2, the virus responsible for COVID-19
Constructing networks directly from the original binary trees would lead to spurious hypotheses of reticulation, i.e. reticulations in the network that are caused by noise rather than by actual reticulate evolutionary events
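The filtering step mentioned in the abstract (contracting edges based on branch length and bootstrap support) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Node` class, the function names, the threshold values, and the taxa A to D are all hypothetical, and the choice to add a contracted edge's length to its children is an assumption made here to preserve root-to-leaf distances. The subsequent resolution of multifurcations is not covered.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical rooted-tree node; not taken from the paper's code."""
    name: Optional[str] = None        # taxon label (leaves only)
    length: float = 0.0               # length of the edge above this node
    support: Optional[float] = None   # bootstrap support of that edge
    children: List["Node"] = field(default_factory=list)

    def is_leaf(self) -> bool:
        return not self.children


def contract_weak_edges(node: Node, min_length: float, min_support: float) -> Node:
    """Contract internal edges that are too short or too weakly supported.

    Works in place, bottom-up. Leaf edges are never contracted. The
    thresholds are illustrative parameters, not values from the paper.
    """
    new_children: List[Node] = []
    for child in node.children:
        contract_weak_edges(child, min_length, min_support)
        weak = (not child.is_leaf()) and (
            child.length < min_length
            or (child.support is not None and child.support < min_support)
        )
        if weak:
            # Attach the grandchildren directly to this node (assumption:
            # the contracted edge's length is added to each of them).
            for grandchild in child.children:
                grandchild.length += child.length
                new_children.append(grandchild)
        else:
            new_children.append(child)
    node.children = new_children
    return node


def to_newick(node: Node) -> str:
    """Topology-only Newick string, without the trailing semicolon."""
    if node.is_leaf():
        return node.name or ""
    return "(" + ",".join(to_newick(c) for c in node.children) + ")"


# Toy tree ((A,B),(C,D)): the (A,B) edge has low bootstrap support.
tree = Node(children=[
    Node(length=0.1, support=40, children=[Node("A", 0.05), Node("B", 0.05)]),
    Node(length=0.2, support=95, children=[Node("C", 0.05), Node("D", 0.05)]),
])
filtered = contract_weak_edges(tree, min_length=0.01, min_support=70)
print(to_newick(filtered))
```

With these made-up thresholds the weakly supported (A,B) clade collapses into a multifurcation at the root while (C,D) is kept, so the filtered tree no longer asserts a grouping that could otherwise be read as a conflicting signal by a network inference algorithm.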
Summary
Rooted phylogenetic networks are used to display complex evolutionary history involving so-called reticulation events, such as genetic recombination. We created a workflow to compare the evolutionary history of SARS-CoV-2 with other SARS-like viruses using several rooted phylogenetic network inference algorithms. This workflow includes filtering noise from sets of phylogenetic trees by contracting edges based on branch length and bootstrap support, followed by resolution of multifurcations. Coronaviruses are known to recombine frequently, resulting in new variants. Such reticulate evolutionary phenomena can potentially confound the construction of phylogenetic trees [1,2,3,4,5].
Biological background of coronaviruses
Since the beginning of 2020, the entire world has been greatly impacted by the outbreak of COVID-19, caused by the Severe Acute Respiratory Syndrome CoronaVirus 2 (SARS-CoV-2). This series of outbreaks has led to extensive public and scientific interest in the origin and evolution of these coronaviruses, in order to prevent possible future outbreaks by other coronaviruses and to accelerate the development of medicines and vaccines.