Supergene validation: A model-based protocol for assessing the accuracy of non-model-based supergene methods

Richard H Adams, Todd A Castoe

doi:10.1016/j.mex.2019.09.025

Abstract

Genome-scale species tree inference is largely restricted to heuristic approaches that use estimated gene trees to reconstruct species-level relationships. Central to these heuristic species tree methods is the assumption that the gene trees are estimated without error. To increase the accuracy of input gene trees used to infer species trees, several techniques have recently been developed for constructing longer “supergenes” that represent sets of loci inferred to share the same genealogical history. Although these supergene methods are designed to improve gene tree accuracy by concatenating several loci into a single “supergene”, thereby increasing the amount of data available for gene tree estimation, no formal protocols have been proposed to validate this key concatenation step. In a recent study, we developed several supergene validation strategies for assessing the accuracy of a popular supergene method: the so-called “statistical binning” pipeline. In this article, we describe a more generalizable, model-based “supergene validation” protocol for assessing the accuracy of supergenes and supergene methods using model-based tests of phylogenetic congruency.

• Supergenes are validated by adopting model-based tests of topological congruence
• These model-based procedures outperform non-model-based methods for supergene construction
• The results of this protocol can be used to assess the overall performance of a supergene method across a phylogenomic dataset

Highlights

These approaches typically implement a two-part procedure whereby individual genealogical trees are first estimated for each genomic locus using maximum likelihood (ML) analyses, and the resulting gene tree estimates are used as input to reconstruct a species tree under the multispecies coalescent model using programs such as MP-EST [1], ASTRAL [2], ASTRID [3], or STEM [4].
While our primary goal is not to review in detail all possible phylogenetic tests that could be used for such a purpose, we provide several tools that proved useful for supergene validation in our original study [15], and we mention additional techniques that could foreseeably be used in a similar manner.
In our original demonstration of supergene validation using the avian phylogenomic analysis [15], we used the likelihood ratio test (LRT) framework implemented in the program Concaterpillar [26], which conducts a series of hierarchical LRTs to test the total number of distinct trees supported by the inferred supergene.
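
The likelihood ratio test mentioned above can be illustrated with a minimal sketch. It assumes that log-likelihoods have already been computed by an external ML program, once with each locus on its own best tree and once with all loci constrained to a single shared topology. The function names, the input values, and the use of a simulated null distribution to obtain a p-value are assumptions made for illustration; this is not the actual implementation of any program cited here.

```python
from typing import Sequence


def lrt_statistic(lnl_separate: Sequence[float], lnl_shared: float) -> float:
    """Return 2 * (lnL with one tree per locus - lnL with one shared tree).

    lnl_separate: log-likelihood of each locus on its own ML tree.
    lnl_shared:   total log-likelihood of all loci on a single shared topology.
    Larger values indicate stronger evidence against a single shared tree.
    """
    return 2.0 * (sum(lnl_separate) - lnl_shared)


def empirical_p_value(observed: float, null_statistics: Sequence[float]) -> float:
    """Proportion of null replicates at least as large as the observed statistic.

    null_statistics is assumed to come from replicate data sets generated
    under the null hypothesis of one shared tree (for example, by simulation).
    The +1 terms keep the estimate away from exactly zero.
    """
    exceed = sum(1 for s in null_statistics if s >= observed)
    return (exceed + 1) / (len(null_statistics) + 1)


# Hypothetical values for a two-locus supergene.
stat = lrt_statistic([-1000.0, -1500.0], -2510.0)
p = empirical_p_value(stat, [1.0, 2.0, 3.0, 25.0])
print(stat, p)
```

A hierarchical procedure would repeat this comparison over nested groupings of loci; the sketch covers only a single comparison.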

Summary

Introduction

For the purpose of this article and to provide the same context as our original supergene validation study [15], we primarily discuss the use of our supergene validation protocol for assessing the accuracy of the statistical binning method that was used to infer supergenes for the avian phylogenomic analyses [11,21,22,23]. After a supergene (or set of supergenes) has been inferred (either based on a priori assumptions or via a more formal supergene method; Fig. 1a), the goal is to test whether the individual loci placed within a supergene should be treated as a single concatenated locus with a single phylogenetic tree topology (i.e., “true supergene”) or not (i.e., “false supergene”).
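
The classification step described above can be sketched as follows, assuming that a congruence test has already produced one p-value per supergene. The supergene names, the p-values, and the 0.05 threshold are hypothetical; in practice, a correction for multiple testing may also be appropriate, and failing to reject congruence is not proof that a supergene is "true".

```python
from typing import Dict


def classify_supergenes(p_values: Dict[str, float], alpha: float = 0.05) -> Dict[str, str]:
    """Label each supergene by whether topological congruence is rejected.

    A supergene is labelled "false supergene" when its p-value falls below
    alpha (a single shared tree is rejected) and "true supergene" otherwise.
    """
    return {
        name: "false supergene" if p < alpha else "true supergene"
        for name, p in p_values.items()
    }


def false_supergene_rate(labels: Dict[str, str]) -> float:
    """Fraction of supergenes labelled as false, as a dataset-level summary."""
    if not labels:
        raise ValueError("no supergenes to summarize")
    n_false = sum(1 for label in labels.values() if label == "false supergene")
    return n_false / len(labels)


# Hypothetical p-values for four supergenes.
p_values = {"bin_001": 0.40, "bin_002": 0.01, "bin_003": 0.20, "bin_004": 0.003}
labels = classify_supergenes(p_values)
print(labels)
print(false_supergene_rate(labels))
```

The rate returned by the second function gives one way to summarize how a supergene method performed across a whole phylogenomic dataset.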

Journal: MethodsX	Publication Date: Jan 1, 2019
Citations: 1	License type: cc-by
