Abstract

Given a set of n data objects and their pairwise dissimilarities, the goal of quartet clustering is to construct an optimal tree from the set of all possible combinations of quartet topologies on the n objects, where optimality means that the sum of the dissimilarities of the embedded (or consistent) quartet topologies is minimal. This corresponds to an NP-hard combinatorial optimization problem, also referred to as the minimum quartet tree cost (MQTC) problem. We describe and formulate this challenging problem, and propose a basic greedy heuristic that is very fast and has some interesting implementation details. The solution approach, though simple, substantially improves the performance of a Reduced Variable Neighbourhood Search for the MQTC problem. The latter is one of the most popular heuristic algorithms for tackling the MQTC problem.

Highlights

  • Quartet clustering methods are popular in computational biology, where dendrograms are ubiquitous

  • Given a set of n data objects and their pairwise dissimilarities, the goal of quartet clustering is to construct an optimal tree from the set of all possible combinations of quartet topologies on the n objects, where optimality means that the sum of the dissimilarities of the embedded quartet topologies is minimal

  • The best reported performance was obtained by an implementation of a Reduced Variable Neighbourhood Search metaheuristic, which we use as the reference benchmark in this paper and aim to outperform


Summary

Introduction

Quartet clustering methods are popular in computational biology, where dendrograms (or phylogenies) are ubiquitous. The MQTC problem assigns a cost value to each simple quartet topology, in order to express the relative importance of the simple quartet topologies to be embedded in the full unrooted binary tree having the n objects as leaves. The full unrooted binary tree with the minimum cost balances the importance of embedding different quartet topologies against others, leading to a binary tree that visually represents the symmetric n × n distance matrix as well as possible. The solution of this problem allows the hierarchical representation of a set of n objects within a full unrooted binary tree [12]. The resulting binary tree has the n objects assigned as leaves, such that objects with small relative dissimilarities are placed close to each other in the tree. This hierarchical clustering approach derived from the MQTC problem is referred to in the literature as the quartet method [4].
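To make the objective concrete, the following minimal sketch evaluates the cost of one candidate tree. It rests on assumptions not stated in this summary: the cost of a quartet topology uv|wx is taken to be d(u,v) + d(w,x), a convention commonly used in the quartet method literature, and the embedded topology of each quartet is identified by comparing path lengths in the tree. The function names, the adjacency-dictionary representation, and the small dissimilarity matrix are hypothetical and chosen only for illustration. This is not the greedy heuristic or the Reduced Variable Neighbourhood Search discussed in the paper.

```python
from collections import deque
from itertools import combinations


def hop_distances(adj, source):
    """Breadth-first search: number of edges from source to every node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def quartet_tree_cost(adj, leaves, d):
    """Sum the costs of the quartet topologies embedded in the tree.

    Assumption: the cost of topology uv|wx is d[u][v] + d[w][x].
    In a full unrooted binary tree, exactly one of the three pairings of a
    quartet is embedded: the one whose two leaf-to-leaf paths are disjoint,
    which is also the pairing with the smallest sum of path lengths.
    """
    hops = {leaf: hop_distances(adj, leaf) for leaf in leaves}
    total = 0.0
    for a, b, c, e in combinations(leaves, 4):
        pairings = [((a, b), (c, e)), ((a, c), (b, e)), ((a, e), (b, c))]
        (u, v), (w, x) = min(
            pairings,
            key=lambda p: hops[p[0][0]][p[0][1]] + hops[p[1][0]][p[1][1]],
        )
        total += d[u][v] + d[w][x]
    return total


# Hypothetical example: 5 leaves (0..4) and 3 internal nodes ("x", "y", "z").
# Leaves 0 and 1 hang from x, leaf 2 from y, leaves 3 and 4 from z.
adj = {
    0: ["x"], 1: ["x"], 2: ["y"], 3: ["z"], 4: ["z"],
    "x": [0, 1, "y"], "y": ["x", 2, "z"], "z": ["y", 3, 4],
}
leaves = [0, 1, 2, 3, 4]

# Made-up symmetric dissimilarity matrix: {0, 1} and {3, 4} are close pairs.
d = [
    [0, 1, 4, 6, 6],
    [1, 0, 4, 6, 6],
    [4, 4, 0, 4, 4],
    [6, 6, 4, 0, 1],
    [6, 6, 4, 1, 0],
]

print(quartet_tree_cost(adj, leaves, d))  # 22.0
```

Swapping leaves 1 and 3 in this toy tree separates both close pairs and raises the cost under the same assumed cost function, which illustrates why a minimum-cost tree tends to place similar objects near each other. Evaluating every quartet this way takes on the order of n^4 operations per tree, so the sketch is only practical for small n.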

Related Work
Greedy Constructive Heuristic
Reduced Variable Neighbourhood Search
Computational Results
Conclusions