Analysis of phylogenomic datasets reveals conflict, concordance, and gene duplications with examples from animals and plants.

Stephen A Smith, Michael J Moore, Joseph W Brown, Ya Yang

doi:10.1186/s12862-015-0423-0


Open Access

https://doi.org/10.1186/s12862-015-0423-0


Abstract

Background

The use of transcriptomic and genomic datasets for phylogenetic reconstruction has become increasingly common as researchers attempt to resolve recalcitrant nodes with increasing amounts of data. The large size and complexity of these datasets introduce significant phylogenetic noise and conflict into subsequent analyses. The sources of conflict may include hybridization, incomplete lineage sorting, or horizontal gene transfer, and may vary across the phylogeny. For phylogenetic analysis, this noise and conflict have been accommodated in one of several ways: by binning gene regions into subsets to isolate consistent phylogenetic signal; by using gene-tree methods for reconstruction, where conflict is presumed to be explained by incomplete lineage sorting (ILS); or through concatenation, where noise is presumed to be the dominant source of conflict. The results provided herein emphasize that analysis of individual homologous gene regions can greatly improve our understanding of the underlying conflict within these datasets.

Results

Here we examined two published transcriptomic datasets, the angiosperm group Caryophyllales and the aculeate Hymenoptera, for the presence of conflict, concordance, and gene duplications in individual homologs across the phylogeny. We found significant conflict throughout the phylogeny in both datasets and in particular along the backbone. While some nodes in each phylogeny showed patterns of conflict similar to what might be expected with ILS alone, the backbone nodes also exhibited low levels of phylogenetic signal. In addition, certain nodes, especially in the Caryophyllales, had highly elevated levels of strongly supported conflict that cannot be explained by ILS alone.

Conclusion

This study demonstrates that phylogenetic signal is highly variable in phylogenomic data sampled across related species and poses challenges when conducting species tree analyses on large genomic and transcriptomic datasets. Further insight into the conflict and processes underlying these complex datasets is necessary to improve and develop adequate models for sequence analysis and downstream applications. To aid this effort, we developed the open source software phyparts (https://bitbucket.org/blackrim/phyparts), which calculates unique, conflicting, and concordant bipartitions, maps gene duplications, and outputs summary statistics such as internode certainty (ICA) scores and node-specific counts of gene duplications.

Electronic supplementary material

The online version of this article (doi:10.1186/s12862-015-0423-0) contains supplementary material, which is available to authorized users.
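The kind of bookkeeping the abstract attributes to phyparts (counting concordant and conflicting bipartitions per node and summarizing them with an internode certainty score) can be illustrated with a minimal Python sketch. It is not the phyparts implementation. It assumes rooted trees written as nested tuples, identical taxon sampling in every gene tree, and no support filtering. The function names (`clades`, `conflicts`, `summarize`, `ica`) and the toy trees are hypothetical. The entropy-based form used here, ICA = 1 + Σ p_i log_n(p_i) over the reference clade and its n − 1 observed alternatives, is likewise an assumption, and the score phyparts reports may be computed differently.

```python
import math
from collections import Counter


def clades(tree):
    """Return the non-trivial clades (frozensets of tip names) of a rooted tree.

    Trees are nested tuples of tip-name strings, e.g. (("A", "B"), "C").
    """
    found = set()

    def tips(node):
        if isinstance(node, str):
            return frozenset([node])
        members = frozenset().union(*(tips(child) for child in node))
        found.add(members)
        return members

    found.discard(tips(tree))  # the root clade holds every tip, so drop it
    return found


def conflicts(a, b):
    """Two rooted clades conflict if they overlap but neither contains the other."""
    return bool(a & b) and not (a <= b or b <= a)


def summarize(species_tree, gene_trees):
    """Map each species-tree clade to (concordant, conflicting, alternatives).

    concordant and conflicting are counts of gene trees; alternatives tallies
    each distinct conflicting clade across all gene trees.
    """
    gene_clades = [clades(g) for g in gene_trees]
    summary = {}
    for clade in clades(species_tree):
        concordant = sum(clade in gc for gc in gene_clades)
        conflicting = sum(
            any(conflicts(clade, alt) for alt in gc) for gc in gene_clades
        )
        alternatives = Counter(
            alt for gc in gene_clades for alt in gc if conflicts(clade, alt)
        )
        summary[clade] = (concordant, conflicting, alternatives)
    return summary


def ica(concordant, alternative_counts):
    """Entropy-based internode certainty (assumed form, see the lead-in).

    No sign convention is applied when an alternative outnumbers the
    reference clade; this sketch returns the unsigned value only.
    """
    counts = [concordant, *alternative_counts]
    n = len(counts)
    total = sum(counts)
    if n < 2 or total == 0:
        return 1.0  # no conflicting clade observed
    probs = [c / total for c in counts if c > 0]
    return 1.0 + sum(p * math.log(p, n) for p in probs)


# Toy data (hypothetical): one species tree and four gene trees on tips A-E.
species = ((("A", "B"), "C"), ("D", "E"))
genes = [
    ((("A", "B"), "C"), ("D", "E")),
    ((("A", "C"), "B"), ("D", "E")),
    ((("A", "B"), "C"), ("D", "E")),
    ((("B", "C"), "A"), ("D", "E")),
]
summary = summarize(species, genes)
for clade, (n_conc, n_conf, alts) in summary.items():
    print(sorted(clade), n_conc, n_conf, round(ica(n_conc, alts.values()), 3))
```

In this toy case the clade {A, B} is recovered by two gene trees and contradicted by two others through two different alternatives, so its score falls close to zero, whereas clades found in every gene tree score 1. Real datasets add complications the sketch ignores, such as missing taxa, gene duplications within homologs, and thresholds on branch support.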


Journal: BMC Evolutionary Biology
Publication Date: Aug 5, 2015
Citations: 427
License type: CC BY 4.0
