Abstract
Background: We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. However, its parameter space, and thus its applicability to a wide taxonomic range, has not been systematically studied. We examined divergence time, population size, time of gene flow, distance of outgroup and number of loci in a sensitivity analysis.

Results: The sensitivity study shows that the primary determinant of the D-statistic is the relative population size, i.e. the population size scaled by the number of generations since divergence. This is consistent with the fact that the main confounding factor in gene flow detection is incomplete lineage sorting, which dilutes the signal. The sensitivity of the D-statistic is also affected by the direction of gene flow and by the size and number of loci. In addition, we examined the ability of the f-statistics, $\widehat{f}_G$ and $\widehat{f}_{hom}$, to estimate the fraction of a genome affected by gene flow. These statistics are difficult to apply to practical questions in biology because the time of gene flow is usually unknown, but they can be used to compare datasets with identical or similar demographic backgrounds.

Conclusions: The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances (divergence times) but it is sensitive to population size. The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations.
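For readers unfamiliar with the D-statistic, the following is a minimal sketch of its usual definition in the literature, D = (n_ABBA - n_BABA) / (n_ABBA + n_BABA), computed over four aligned sequences ordered as P1, P2, P3 and outgroup. It is not taken from the paper's own pipeline: the function name `d_statistic`, the restriction to biallelic A/C/G/T sites and the toy sequences are all assumptions made for illustration.

```python
def d_statistic(p1, p2, p3, outgroup):
    """Return (D, n_abba, n_baba) for four aligned sequences.

    Hypothetical helper for illustration only. The outgroup allele is
    treated as ancestral ("A") and the other allele as derived ("B").
    """
    if not (len(p1) == len(p2) == len(p3) == len(outgroup)):
        raise ValueError("sequences must be aligned to the same length")

    valid = set("ACGT")
    n_abba = 0
    n_baba = 0
    columns = zip(p1.upper(), p2.upper(), p3.upper(), outgroup.upper())
    for a, b, c, o in columns:
        # Skip gaps and ambiguity codes, and keep only biallelic sites.
        if not {a, b, c, o} <= valid or len({a, b, c, o}) != 2:
            continue
        if a == o and b == c and b != o:
            n_abba += 1  # P2 and P3 share the derived allele
        elif b == o and a == c and a != o:
            n_baba += 1  # P1 and P3 share the derived allele

    total = n_abba + n_baba
    if total == 0:
        # D is undefined when no informative sites are present.
        return float("nan"), n_abba, n_baba
    return (n_abba - n_baba) / total, n_abba, n_baba


# Toy alignment (assumed data): two ABBA sites and one BABA site.
d, n_abba, n_baba = d_statistic("AAGTCT", "ATGTAG", "ATCTAT", "AAGTCG")
print(d, n_abba, n_baba)
```

Under incomplete lineage sorting alone, ABBA and BABA sites are expected in equal numbers, so D is expected to be near zero; an excess of either pattern is the signal interpreted as gene flow. In practice, significance is usually assessed with a resampling scheme such as a block jackknife over loci, which this sketch omits.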
Highlights
We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species
The D-statistic, as a method to detect gene flow, is robust against a wide range of genetic distances but it is sensitive to population size
The D-statistic should only be applied with critical reservation to taxa where population sizes are large relative to branch lengths in generations
Summary
We evaluated the sensitivity of the D-statistic, a parsimony-like method widely used to detect gene flow between closely related species. This method has been applied to a variety of taxa with a wide range of divergence times. Introgression refers to alleles from one species entering a different (and usually closely related) species through migration and hybridization. It violates the assumption in traditional phylogenetics that speciation is a sudden event after which no exchange of genetic information occurs. Incomplete lineage sorting refers to the case where lineages at a given locus fail to coalesce in the branch immediately ancestral to their population divergence, so that three or more un-coalesced lineages exist in a population [1, 2]. Later methods can be generally separated into two categories: likelihood-based/Bayesian-