We present Synesth, the most comprehensive and flexible tool for tree reconciliation that allows for events on syntenies (i.e., on sets of multiple genes), including duplications, transfers, fissions, and transient events going through unsampled species. This model allows for building histories that explicate the inconsistencies between a synteny tree and its associated species tree. We examine the combinatorial properties of this extended reconciliation model and study various associated parsimony problems. First, the infinite set of explicatory histories is reduced to a finite but exponential set of Pareto-optimal histories (in terms of counts of each event type), then to a polynomial set of Pareto-optimal event count vectors, and this eventually ends with minimum event cost histories given an event cost function. An inductive characterization of the solution space using different algebras for each granularity leads to efficient dynamic programming algorithms, ultimately ending with an O(mn) time complexity algorithm for computing the cost of a minimum-cost history (m and n: number of nodes in the input synteny and species trees). This time complexity matches that of the fastest known algorithms for classical gene reconciliation with transfers. We show how Synesth can be applied to infer Pareto-optimal evolutionary scenarios for CRISPR-Cas systems in a set of bacterial genomes.
Read full abstract