Abstract
Background: Discovering the location of gene duplications and multiple gene duplication episodes is a fundamental issue in evolutionary molecular biology. The problem, introduced by Guigó et al. in 1996, is to map gene duplication events from a collection of rooted, binary gene family trees onto their corresponding rooted binary species tree in such a way that the total number of multiple gene duplication episodes is minimized. Several models in the literature specify how gene duplications from gene families can be interpreted as one duplication episode. However, in all duplication episode problems the gene trees are rooted. This restriction limits applicability, since phylogenetic methods frequently infer unrooted gene family trees.

Results: In this article we present the first solution to the open problem of episode clustering where the input gene family trees are unrooted. In particular, using theoretical properties of unrooted reconciliation, we give an efficient algorithm that reduces this problem to the episode clustering problems defined for rooted trees. We establish theoretical properties of the reduction algorithm and evaluate it on empirical datasets.

Conclusions: We provide algorithms and tools that were successfully applied to several empirical datasets. In particular, our comparative study shows that we can improve known results on genomic duplication inference from real datasets.
Highlights
Discovering the location of gene duplications and multiple gene duplication episodes is a fundamental issue in evolutionary molecular biology
In this article we present the first solution to the open problem [27] of unrooted episode clustering, that is, the problem of episode clustering where the input consists of unrooted gene trees
Basic notation: a species tree is a rooted binary tree with leaves uniquely labeled by the names of species
Summary
Discovering the location of gene duplications and multiple gene duplication episodes is a fundamental issue in evolutionary molecular biology. Genomic duplication plays an important role in the evolution of life on Earth. While the reconstruction of the evolutionary history of individual genes is generally well established [8,9,10,11,12,13], little is still known about the inference of large genomic duplications that can span thousands of gene families. In this approach, we propose to use the reconciliation model, in which a gene tree is reconciled with its species tree. The reconstruction of large gene duplication events may be difficult.