Abstract

Background

While tree-oriented methods for inferring orthology and paralogy relations between genes are based on reconciling a gene tree with a species tree, many tree-free methods are also available (usually based on sequence similarity). Recently, the link between orthology relations and gene trees has been formally considered from the perspective of reconstructing phylogenies from orthology relations. In this paper, we consider this link from a correction point of view. Indeed, a gene tree induces a set of relations, but the converse is not always true: a set of relations is not necessarily in agreement with any gene tree. A natural question is thus how to minimally correct an infeasible set of relations. Another natural question, given a gene tree and a set of relations, is how to minimally correct the gene tree so that the resulting gene tree fits the set of relations.

Results

We consider four variants of relation and gene tree correction problems, and provide hardness results for all of them. More specifically, we show that it is NP-hard to edit a minimum number of relations to make them consistent with a given species tree. We also show that the problem of finding a maximum subset of genes that share consistent relations is hard to approximate. We then demonstrate that editing a gene tree in a minimum way to satisfy a given set of relations is NP-hard, where “minimum” refers either to the number of modified relations depicted by the gene tree or to the number of clades that are lost. We also discuss some of the algorithmic perspectives given these hardness results.

Highlights

  • While tree-oriented methods for inferring orthology and paralogy relations between genes are based on reconciling a gene tree with a species tree, many tree-free methods are available

  • We developed the first algorithm for gene tree correction using orthology relations [7]

  • Maximum clade correction problem: Input: A gene tree G, a species tree S, a set O of orthology relations, a set P of paralogy relations and an integer k; Output: “Yes” if there exists an S-consistent DS-tree G′ satisfying O and P such that G and G′ have at least k clades in common
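The clade condition in the problem statement above can be illustrated with a minimal sketch. It assumes that a clade is the set of leaves (genes) below a node, that trivial clades (single leaves and the root) are counted, and that trees are encoded as nested tuples of gene names; the function names and the encoding are hypothetical and not taken from the paper. The sketch only checks the "at least k clades in common" part and does not test whether G′ is an S-consistent DS-tree satisfying O and P.

```python
from typing import FrozenSet, Set, Tuple, Union

# Hypothetical encoding: a leaf is a gene name, an internal node is a tuple of subtrees.
Tree = Union[str, Tuple["Tree", ...]]


def clades(tree: Tree) -> Set[FrozenSet[str]]:
    """Return the leaf set below every node of the tree (leaves and root included)."""
    found: Set[FrozenSet[str]] = set()

    def visit(node: Tree) -> FrozenSet[str]:
        if isinstance(node, str):
            leaves = frozenset([node])
        else:
            # The clade of an internal node is the union of its children's clades.
            leaves = frozenset().union(*(visit(child) for child in node))
        found.add(leaves)
        return leaves

    visit(tree)
    return found


def shares_at_least_k_clades(g: Tree, g_prime: Tree, k: int) -> bool:
    """Check only the clade condition: G and G' have at least k clades in common."""
    return len(clades(g) & clades(g_prime)) >= k


# Two toy gene trees on the same four genes.
G = (("a1", "b1"), ("a2", "c1"))
G_PRIME = ((("a1", "b1"), "a2"), "c1")
COMMON = clades(G) & clades(G_PRIME)
```

In this toy example the two trees share the four leaf clades, the clade {a1, b1} and the root clade, while {a2, c1} is lost in G′.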


Summary

Introduction

While tree-oriented methods for inferring orthology and paralogy relations between genes are based on reconciling a gene tree with a species tree, many tree-free methods are available (usually based on sequence similarity).

Minimum edge-removal consistency problem: Input: A relation graph R for a gene family Γ, a species tree S and an integer k; Output: “Yes” if and only if there exists an S-consistent subgraph R′ of R with V(R′) = V(R) such that |E(R) \ E(R′)| ≤ k.
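The budget condition |E(R) \ E(R′)| ≤ k can be made concrete with a small sketch. It assumes that the edges of a relation graph are given as unordered pairs of gene names and that both graphs have the same vertex set; the function names are hypothetical. The sketch only counts removed edges and does not decide whether R′ is S-consistent, which is the hard part of the problem.

```python
from typing import FrozenSet, Iterable, Set, Tuple

Edge = Tuple[str, str]


def normalize(edges: Iterable[Edge]) -> Set[FrozenSet[str]]:
    """Store each edge as an unordered pair so (a, b) and (b, a) compare equal."""
    return {frozenset(edge) for edge in edges}


def removed_edges(r_edges: Iterable[Edge], r_prime_edges: Iterable[Edge]) -> Set[FrozenSet[str]]:
    """Return E(R) \\ E(R'), after checking that R' only uses edges of R."""
    r = normalize(r_edges)
    r_prime = normalize(r_prime_edges)
    if not r_prime <= r:
        raise ValueError("R' must be a subgraph of R: it contains an edge not in R")
    return r - r_prime


def within_budget(r_edges: Iterable[Edge], r_prime_edges: Iterable[Edge], k: int) -> bool:
    """Check only the budget condition |E(R) \\ E(R')| <= k."""
    return len(removed_edges(r_edges, r_prime_edges)) <= k


# Toy relation graph on genes a1, a2, b1, c1 and a candidate subgraph.
R_EDGES = [("a1", "b1"), ("a1", "c1"), ("b1", "c1"), ("a2", "c1")]
R_PRIME_EDGES = [("b1", "a1"), ("a1", "c1"), ("a2", "c1")]
```

Here R′ drops the single edge between b1 and c1, so it fits a budget of k = 1 but not k = 0.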

