Computing nearest neighbour interchange distances between ranked phylogenetic trees

Lena Collienne, Alex Gavryushkin

doi:10.1007/s00285-021-01567-5

Open Access

Journal: Journal of Mathematical Biology
Publication Date: Jan 1, 2021
Citations: 10
License type: open-access

Affiliation: University of Otago

Abstract

Many popular algorithms for searching the space of leaf-labelled (phylogenetic) trees are based on tree rearrangement operations. Under any such operation, the problem is reduced to searching a graph where vertices are trees and (undirected) edges are given by pairs of trees connected by one rearrangement operation (sometimes called a move). Most popular are the classical nearest neighbour interchange, subtree prune and regraft, and tree bisection and reconnection moves. The problem of computing distances, however, is NP-hard in each of these graphs, making tree inference and comparison algorithms challenging to design in practice. Although ranked phylogenetic trees are one of the central objects of interest in applications such as cancer research, immunology, and epidemiology, the computational complexity of the shortest path problem for these trees remained unsolved for decades. In this paper, we settle this problem for the ranked nearest neighbour interchange operation by establishing that the complexity depends on the weight difference between the two types of tree rearrangements (rank moves and edge moves), and varies from quadratic, which is the lowest possible complexity for this problem, to NP-hard, which is the highest. In particular, our result provides the first example of a phylogenetic tree rearrangement operation for which shortest paths, and hence the distance, can be computed efficiently. Specifically, our algorithm scales to trees with tens of thousands of leaves (and likely hundreds of thousands if implemented efficiently).
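To make the graph view in the abstract concrete, the following is a minimal sketch of distance as shortest-path length in a rearrangement graph, computed by breadth-first search. It is not the paper's algorithm. The names `rearrangement_distance`, `neighbours`, and `adjacent_swaps` are hypothetical, and the toy move (swapping adjacent entries of a tuple) is only a stand-in for a real tree rearrangement such as NNI.

```python
from collections import deque
from typing import Callable, Hashable, Iterable, Optional


def rearrangement_distance(
    start: Hashable,
    goal: Hashable,
    neighbours: Callable[[Hashable], Iterable[Hashable]],
) -> Optional[int]:
    """Length of a shortest path from start to goal, or None if unreachable.

    `neighbours(state)` is assumed to yield every state that is exactly
    one move away from `state` (hypothetical interface).
    """
    if start == goal:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        state, dist = queue.popleft()
        for nxt in neighbours(state):
            if nxt == goal:
                return dist + 1
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None


def adjacent_swaps(perm: tuple) -> Iterable[tuple]:
    """Toy stand-in for a tree move: swap two adjacent entries of a tuple."""
    for i in range(len(perm) - 1):
        swapped = list(perm)
        swapped[i], swapped[i + 1] = swapped[i + 1], swapped[i]
        yield tuple(swapped)


if __name__ == "__main__":
    print(rearrangement_distance((1, 2, 3), (3, 2, 1), adjacent_swaps))
```

Exhaustive search of this kind visits a number of states that grows very quickly with the size of the objects, which is why it is impractical on tree spaces and why an algorithm that computes shortest paths directly is significant.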

Highlights

We thank Alexei Drummond, David Bryant, and Kieran Elmes for useful discussions about the weight difference between RNNI moves, complexity, and applied aspects of our results.
For example, in species evolution, where internal nodes of trees correspond to speciation events, the ranking of these nodes represents the order of divergence events in time.
Most tree inference methods rely on various tree rearrangement operations (Semple and Steel 2003), the most popular of which are nearest neighbour interchange (NNI), subtree prune and regraft (SPR), and tree bisection and reconnection (TBR).

Summary

One of the major problems in computational biology is the reconstruction of evolutionary histories, known as phylogenetic trees, from sequence data such as RNA, DNA, or protein sequences. Most tree inference methods rely on various tree rearrangement operations (Semple and Steel 2003), the most popular of which are nearest neighbour interchange (NNI), subtree prune and regraft (SPR), and tree bisection and reconnection (TBR). Computing the NNI distance is known to be fixed parameter tractable (DasGupta et al. 1999). Importantly, these algorithms remain impractical for large distances and are only applied to trees with a moderate number of leaves or those with small distances (Whidden and Matsen 2018). The Robinson–Foulds distance is not motivated by a biological process, unlike, for example, SPR, where the tree rearrangement operation can be used to model hybridisation and other horizontal events. This pattern is quite common: tree distance measures that are easy to compute lack biological interpretability, while those that are biologically meaningful are often hard to compute (Whidden and Matsen 2018). Because NNI can be seen as a special case of ranked NNI (RNNI), we investigate whether there exists a threshold at which the complexity of the shortest path problem shifts from

Definitions and background results

FINDPATH algorithm

FINDPATH computes shortest paths in optimal time

Additional open problems
